ABSTRACT

Estimation and detection of optical signals distorted by diffraction,
additive background noise, and multiplicative (detection) noise are
studied. Assuming that the output of the detector is a Poisson process,
that the signal and noise are additive, and that they have prescribed
means and covariance matrices, the optimum linear estimate of the
optical signal or object is obtained. In the physical detection process,
the interaction between the incident radiation and the detector produces
an effect called multiplicative noise which must be taken into account
in obtaining the optimum linear estimate. The performance of the
estimation procedure is evaluated for several special cases. Both white
and colored noise are considered in the estimation problem. The problem
of discriminating between optical signals is considered. Optimum
procedures are derived for detecting known and unknown optical signals
using fixed-sample detectors. The properties of sequential detectors
which are optimum for the detection of random or unknown optical
signals are investigated. A comparison is made of the average test
lengths of these optimum random signal detectors with those of a
detector designed for particular optical signals. The test lengths of
the fixed-sample detector and sequential detector are compared for a
particular example.
INTRODUCTION


In an optical system the final image is not an exact representation
of the original object. In general the image differs from the object due to
diffraction and stray light or additive background noise. The problem is
further complicated when the image is measured. When measurements
are made, detection noise (multiplicative noise) is introduced.

In  the  absence  of  any  noise,  with  distortion  due  only  to  diffraction, 
Harris  (1964)  showed  that  the  object  can  in  principle  be  reconstructed 
exactly  if  the  object  is  known  to  be  spatially  bounded.  In  general,  however, 
additive  and  multiplicative  noise  will  be  present  and  will  give  rise  to  error 
in  any  restoration  procedure.  In  establishing  such  a  procedure,  we  need 
to take into account any known statistics since the restoration procedure
in the presence of noise may be different from the procedure used when
noise  is  absent. 

In  this  paper,  methods  of  detecting  and  estimating  optical  signals  which 
have  been  distorted  by  diffraction,  additive  noise,  and  multiplicative  noise 
are  investigated.  The  estimation  procedures  considered  are  the  minimum 
mean-square-error  estimate,  the  maximum  a  posteriori  estimate,  the 
maximum  likelihood  estimate,  and  the  Bayes'  estimate.  The  main 
emphasis will be on the minimum mean-square-error estimate. For the
detection procedures, both fixed-sample detection and sequential detection
are studied. Comparisons are made between detecting known signals and
unknown  signals  to  determine  the  deterioration  in  performance  due  to 
ignorance  about  the  unknown  signals. 
STATEMENT OF THE PROBLEM

Throughout this paper the conditions necessary for Fraunhofer
diffraction will be assumed to be satisfied (Born and Wolf, 1964; Stone,
1963). In essence, these conditions require that the effective distances
from a point in the object plane (or observation point in the image plane)
to any two points in the aperture plane differ by not more than a small
fraction of a wavelength. Also, the radiation will be assumed to be
spatially incoherent and quasi-monochromatic. By quasi-monochromatic
we mean that the radiation has a frequency bandwidth which is much
smaller than the frequency itself. Unless stated otherwise, the observed
quantities will be numbers of photoelectrons and the estimated quantities
will be average numbers of photons per unit time (mean rates).

Under the conditions of Fraunhofer diffraction, the techniques of
Fourier analysis can be used to investigate the characteristics of the
optical system (O'Neill, 1963). The optical system can then be treated
as a linear filter of spatial frequencies whose properties are described
by a transfer function $A(f_\xi, f_\zeta)$, where $\xi$ and $\zeta$ are
image-plane coordinates. For incoherent illumination, the spatial frequency
spectrum of the image $V(f_\xi, f_\zeta)$ is found by multiplying the spatial
frequency spectrum of the object $W(f_\alpha, f_\beta)$, where $\alpha$ and
$\beta$ are the object-plane coordinates, by the system transfer function
$A(f_\xi, f_\zeta)$ (see Appendix). Alternatively, by the convolution theorem,
the image intensity distribution $v(\xi, \zeta)$ is obtained by convolving the
object intensity distribution $w(\alpha, \beta)$ with the point spread
function $a(\xi, \zeta)$ of the optical system. Hence,

$$v(\xi, \zeta) = w(\xi, \zeta) * a(\xi, \zeta) \tag{1}$$

where $w(\xi, \zeta)$ is the object intensity distribution referred to the image
plane and $*$ denotes convolution.
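The convolution in (1) can be illustrated with a one-dimensional discrete sketch. The object samples and the three-point point spread function below are made-up illustrative values, not data from this paper, and the function name `convolve` is ours.

```python
def convolve(obj, psf):
    """Full discrete convolution of two 1-D sequences (lists of floats)."""
    out = [0.0] * (len(obj) + len(psf) - 1)
    for i, w in enumerate(obj):
        for k, a in enumerate(psf):
            out[i + k] += w * a
    return out

# Hypothetical object: two point sources of intensity 1 and 2.
obj = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0]
# Hypothetical normalized point spread function.
psf = [0.25, 0.5, 0.25]

image = convolve(obj, psf)
print(image)
```

Each point source is replaced by a scaled copy of the point spread function, and because this point spread function sums to one the total intensity of the image equals that of the object.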

The image $v(\xi, \zeta)$ is further distorted by additive background
noise $q(\xi, \zeta)$, and the resulting image intensity distribution is
$r(\xi, \zeta) = v(\xi, \zeta) + q(\xi, \zeta)$. During the detection of
$r(\xi, \zeta)$ the interaction between the impinging radiation and the
detector produces a multiplicative effect or detection noise, resulting in
an image $u(\xi, \zeta)$ or a stream of photoelectrons $z$. Our objective is
to count the number of photoelectrons in the output and, from this count,
estimate $w(\xi, \zeta)$ or discriminate between two alternative signals
$w_1(\xi, \zeta)$ and $w_2(\xi, \zeta)$. The estimation and discrimination
procedures we develop depend upon the statistics of the additive and
multiplicative noise as well as any a priori information available
concerning the optical signals to be estimated or detected.
STATISTICAL MODEL

Radiation can be observed only through its interaction with matter.
The interaction process we will consider results from the photoelectric
effect. The receptor in the image plane will be assumed to be a
photosensitive surface divided into a large number of very small regions or
cells. It is assumed that the cells are small enough that the illuminance
is approximately constant within a given cell. Consider the radiation
incident upon the elementary regions or cells to be streams of photons,
each with energy $h\nu$, where $h$ is Planck's constant and $\nu$ is the
frequency of the incident radiation. The average number of photons $y$
incident upon a cell in the time interval $\tau$ is equal to the incident
energy of that cell $\tau r$ divided by $h\nu$, where $r$ is the received
intensity due to the diffracted object and additive noise. The cells are
labeled with the index $i$, and $y_i$ represents the number of photons
incident upon the $i$th cell. The number of photoelectrons $z_i$ emitted
from the $i$th cell depends upon the incident energy and also upon the
multiplicative effect of the receptor. Because of the stochastic nature of
the interaction between radiation and matter, for a given $y_i$ the
quantity $z_i$ is a random variable rather than a deterministic quantity
and must be described in probabilistic terms.

The number of photoelectrons $z_i$ emitted from each cell constitutes
the observed data. It is assumed that the location where each
photoelectric event takes place can be determined.
The photons that strike the light-sensitive surface of the receptor will
cause some type of reaction that can be measured. For example, in the
photographic film case, the photons will cause many of the silver halide
grains to become developable. The pattern that results on the developed
photographic film will be a measure of the number of photons reaching
the image plane. In this case, film granularity and saturation must be
taken into account when determining the number of incoming photons. In
the photomultiplier tube case, a single photon that strikes the
light-sensitive plate gives rise to many electrons in the output. By
scanning the image plane with a photomultiplier tube it is possible to
obtain an estimate of the number of photons that are incident upon each of
the incremental cells. For a simple photon-electron converter, a photon
gives rise to a single photoelectron with probability $\eta$. The quantity
$\eta$ is called the quantum efficiency. The photon-electron converter is a
degenerate case of the photomultiplier tube case in which we consider only
the first stage of the photomultiplier tube.

Throughout this paper we will assume that if the incident energy per
unit time $v_i$ (or mean rate of signal photons $\bar{s}_i$) from an optical
signal is known, the signal photon statistics are Poisson, with the
probability that exactly $s_i$ signal photons will impinge upon the $i$th
cell in time $\tau$ given by

$$P(s_i) = \frac{(\tau \bar{s}_i)^{s_i}\, e^{-\tau \bar{s}_i}}{s_i!} \tag{2}$$
where the mean rate $\bar{s}_i = v_i / h\nu$. Likewise, if the incident energy
per unit time $q_i$ (or mean rate of noise photons $\bar{n}_i$) from the
additive noise is known, the noise photon statistics are assumed Poisson,
with the probability that exactly $n_i$ noise photons will impinge upon the
$i$th cell in time $\tau$ given by

$$P(n_i) = \frac{(\tau \bar{n}_i)^{n_i}\, e^{-\tau \bar{n}_i}}{n_i!} \tag{3}$$

where the mean rate $\bar{n}_i = q_i / h\nu$.

For the unknown signal case, the mean rate of signal photons $\bar{s}_i$
incident upon the $i$th cell in the image plane is a random variable. The
statistical properties of $\bar{s}_i$ need to be considered since they
influence the statistical properties of the signal photoelectrons emitted
during the reception of a signal. For this case the prior distribution of
$\bar{s}_i$ will be assumed to be the following gamma distribution (Goodman,
1965; Farrell, 1966):

$$f(\bar{s}_i) = \begin{cases} \dfrac{\beta_i (\beta_i \bar{s}_i)^{u_i - 1}\, e^{-\beta_i \bar{s}_i}}{\Gamma(u_i)}, & \bar{s}_i \ge 0 \\[2ex] 0, & \text{otherwise} \end{cases} \tag{4}$$

where $E(\bar{s}_i) = \bar{\bar{s}}_i = u_i/\beta_i$ and
$\operatorname{var}(\bar{s}_i) = \sigma_{\bar{s}_i}^2 = u_i/\beta_i^2$. For this
unknown signal case the probability that $s_i$ signal photons will impinge
upon the $i$th cell in time $\tau$ is given by

$$P(s_i) = \int_0^\infty P(s_i \mid \bar{s}_i)\, f(\bar{s}_i)\, d\bar{s}_i \tag{5}$$
where

$$P(s_i \mid \bar{s}_i) = \frac{(\tau \bar{s}_i)^{s_i}\, e^{-\tau \bar{s}_i}}{s_i!} \tag{6}$$

and $f(\bar{s}_i)$ is the gamma distribution in (4).
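As a numerical illustration of (5), the sketch below integrates the Poisson probability (6) against the gamma prior (4) with a midpoint rule and compares the result with the standard closed form of a gamma-mixed Poisson (a negative binomial). The parameter values, the integration limits and all function names are illustrative assumptions, not quantities from this paper.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k events for a positive mean, as in (6)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def gamma_prior(rate, u, beta):
    """Gamma density of (4) with shape u and rate parameter beta."""
    return beta * (beta * rate) ** (u - 1) * math.exp(-beta * rate) / math.gamma(u)

def mixture_numeric(k, tau, u, beta, upper=40.0, steps=20000):
    """Midpoint-rule approximation of the integral in (5)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        rate = (j + 0.5) * h
        total += poisson_pmf(k, tau * rate) * gamma_prior(rate, u, beta)
    return total * h

def mixture_closed(k, tau, u, beta):
    """Closed form of (5): a negative binomial probability."""
    log_p = (math.lgamma(u + k) - math.lgamma(k + 1) - math.lgamma(u)
             + u * math.log(beta / (tau + beta))
             + k * math.log(tau / (tau + beta)))
    return math.exp(log_p)

tau, u, beta = 1.5, 2.0, 1.0   # hypothetical values
numeric = [mixture_numeric(k, tau, u, beta) for k in range(6)]
closed = [mixture_closed(k, tau, u, beta) for k in range(6)]
for k in range(6):
    print(k, round(numeric[k], 6), round(closed[k], 6))
```

The closed form is convenient because it avoids numerical integration when the probabilities are needed for many counts.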

For the case of known signal and noise, both the signal and noise
photons obey Poisson statistics. Because of the additive nature of the
Poisson distribution the total photon stream has a Poisson distribution.
The probability that $y_i$ signal-plus-noise photons will impinge upon the
$i$th cell in time $\tau$ is given by

$$P(y_i \mid \bar{s}_i, \bar{n}_i) = \frac{[\tau(\bar{s}_i + \bar{n}_i)]^{y_i}\, e^{-\tau(\bar{s}_i + \bar{n}_i)}}{y_i!} = \frac{(\tau \bar{y}_i)^{y_i}\, e^{-\tau \bar{y}_i}}{y_i!}. \tag{7}$$

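For the case where the mean rate of signal photons $\bar{s}_i$ is unknown
and the mean rate of noise photons is known, the probability of exactly
$y_i$ photons impinging upon the $i$th cell in time $\tau$ is given by

$$P(y_i \mid \bar{n}_i) = \int_0^\infty P(y_i \mid \bar{s}_i, \bar{n}_i)\, f(\bar{s}_i)\, d\bar{s}_i = \left(\frac{\beta_i}{\tau + \beta_i}\right)^{u_i} \frac{e^{-\tau \bar{n}_i}\, (\bar{n}_i \tau)^{y_i}}{\Gamma(u_i)} \sum_{j=0}^{y_i} \frac{\Gamma(u_i + j)}{(y_i - j)!\; j!\; \bar{n}_i^{\,j}\, (\tau + \beta_i)^j}. \tag{8}$$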

When the mean rate of noise photons $\bar{n}_i$ is unknown we will assume its
prior distribution to be the following gamma distribution:

$$f(\bar{n}_i) = \begin{cases} \dfrac{\lambda_i (\lambda_i \bar{n}_i)^{u_i - 1}\, e^{-\lambda_i \bar{n}_i}}{\Gamma(u_i)}, & \bar{n}_i \ge 0 \\[2ex] 0, & \text{otherwise.} \end{cases} \tag{9}$$


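When the mean rates of noise photons $\bar{n}_i$ and signal photons
$\bar{s}_i$ are unknown (i.e., $\bar{y}_i = \bar{s}_i + \bar{n}_i$ is unknown)
we will assume that the prior distribution of the sum $\bar{y}_i$ is given
by the following gamma distribution:

$$f(\bar{y}_i) = f(\bar{s}_i + \bar{n}_i) = \begin{cases} \dfrac{\alpha_i (\alpha_i \bar{y}_i)^{u_i - 1}\, e^{-\alpha_i \bar{y}_i}}{\Gamma(u_i)}, & \bar{y}_i \ge 0 \\[2ex] 0, & \text{otherwise.} \end{cases} \tag{10}$$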
MATRIX REPRESENTATION


In this paper the spatially bounded objects are divided into small
cells over which the intensity is approximately constant. If these cells
are made small enough, they may represent point sources. By knowing
the point spread function of the system, the image can be approximated
for incoherent light by superposition of the point spread functions
resulting from all the point sources of the dissected object. Here we are
assuming spatial invariance. By this we mean that the object is small
enough that points of a given intensity located anywhere on the object give
rise to the same point spread function in the image plane. The location of
the point spread function is determined by the position of the point source
(O'Neill, 1963).


The number of photons emitted from the $j$th region (point source)
of the object plane, $x_j(\alpha_j, \beta_j)$, can be represented by a delta
function $x_j(\alpha_j, \beta_j) = x_j\, \delta(\alpha - \alpha_j, \beta - \beta_j)$,
where $\alpha$ and $\beta$ are the coordinate representation in the object
plane. Consider the system impulse response or point spread function
$a(\xi, \zeta)$, where $\xi$ and $\zeta$ are the coordinate representation in
the image plane. The optical image or point spread function in the image
plane due to the single point source $x_j(\alpha_j, \beta_j)$ is, using
image-plane coordinates,

$$x_j(\xi_j, \zeta_j) * a(\xi, \zeta) = x_j\, \delta(\xi - \xi_j, \zeta - \zeta_j) * a(\xi, \zeta) = x_j\, a(\xi - \xi_j, \zeta - \zeta_j) = s_j(\xi - \xi_j, \zeta - \zeta_j).$$

The total image is the superposition of the images of all $m$ point sources.
That is,

$$s(\xi, \zeta) = \sum_{j=1}^{m} x_j\, a(\xi - \xi_j, \zeta - \zeta_j) = \sum_{j=1}^{m} s_j(\xi - \xi_j, \zeta - \zeta_j). \tag{11}$$


The image at the point $(\xi^i, \zeta^i)$ will be due to the images of the
$m$ point sources, and hence we have

$$s(\xi^i, \zeta^i) = \sum_{j=1}^{m} s_j(\xi^i - \xi_j, \zeta^i - \zeta_j) = \sum_{j=1}^{m} a(\xi^i - \xi_j, \zeta^i - \zeta_j)\, x_j. \tag{12}$$

Let $s_i = s(\xi^i, \zeta^i)$ and $a_{ij} = a(\xi^i - \xi_j, \zeta^i - \zeta_j)$. The
quantity $s_i$ is the number of photons due to the optical signal incident
upon the $i$th cell of the image plane. We can then write

$$s_i = \sum_{j=1}^{m} a_{ij} x_j. \tag{13}$$


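In matrix notation this can be written as

$$s = Ax \tag{14}$$

where

$$s = \begin{bmatrix} s_1 \\ \vdots \\ s_n \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}, \qquad \text{and} \qquad A = \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix}.$$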

If additive noise is present at the image, each measurement of $s_i$
will be corrupted by an additive noise element $n_i$; hence, we have

$$y_i = s_i + n_i = \sum_{j=1}^{m} a_{ij} x_j + n_i. \tag{15}$$

The quantity $n_i$ is the number of photons due to additive noise incident
upon the $i$th cell of the image plane. In matrix notation the observed
vector is

$$y = s + n = Ax + n \tag{16}$$

where

$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \qquad \text{and} \qquad n = \begin{bmatrix} n_1 \\ \vdots \\ n_n \end{bmatrix}.$$

The vector $y$ will, during the detection process, be contaminated
by multiplicative noise, the form of which will be discussed later.

We will assume throughout this paper that the vector representation
of the object is sufficiently accurate that any error associated with it is
negligible.
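A small sketch of the matrix model, relating the signal vector to the object through the system matrix and then adding noise: the point spread function (a squared sinc), the cell and source positions, the object intensities and the noise values are all illustrative assumptions, as are the helper names.

```python
import math

def sinc2(x):
    """Squared sinc, (sin(pi x) / (pi x))^2, used here as a stand-in point spread function."""
    if x == 0.0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2

def system_matrix(cell_positions, source_positions, psf):
    """a_ij = psf(position of cell i - position of source j)."""
    return [[psf(ci - sj) for sj in source_positions] for ci in cell_positions]

def mat_vec(A, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

cells = [0.0, 0.5, 1.0, 1.5]     # hypothetical measurement positions
sources = [0.0, 1.0]             # hypothetical point-source positions
A = system_matrix(cells, sources, sinc2)

x = [10.0, 4.0]                  # hypothetical object vector
n = [1.0, 1.0, 1.0, 1.0]         # hypothetical additive noise vector
s = mat_vec(A, x)                # s = A x
y = [si + ni for si, ni in zip(s, n)]   # y = A x + n
print(y)
```

With four cells and two sources the matrix has four rows and two columns, so the system is overdetermined, as in the examples where several measurements are made per point source.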
MINIMUM MEAN-SQUARE-ERROR ESTIMATE

Introduction

In this section we investigate the problem of obtaining optimum
estimates of an optical signal or object distorted by diffraction, additive
background noise, and multiplicative noise using the criterion of minimum
mean-square error.

Consider the photon-stream vector $y$ impinging upon the light-sensitive
surface in the image plane and giving rise to the output vector $z$. The
quantity $y_i$ is the number of photons due to the optical signal and
additive noise incident upon the $i$th cell of the image plane. The quantity
$z_i$ is the number of photoelectrons due to the optical signal, additive
noise, and multiplicative noise being emitted from the $i$th cell of the
image plane. The quantity $z_i$ can be thought of as the number of counts
(i.e., photoelectrons for the photomultiplier tube case and developable
grains in the photographic film case) in the output of the detector. The
system being considered is illustrated in Figure 1.

For a given mean rate vector $\bar{n}$ we will assume that the noise $n$
has a conditional covariance matrix $K_n$. We will assume that for a given
mean rate vector $\bar{x}$ the object $x$ has a conditional covariance matrix
$K_x$ and that $x$ and $n$ are conditionally independent (i.e., conditioned
on knowing $\bar{n}$ and $\bar{x}$). Also, we will assume that the mean rate
vector $\bar{n}$ has mean $\bar{\bar{n}}$ and a covariance matrix $K_{\bar{n}}$,
that the mean rate vector $\bar{x}$ has a mean $\bar{\bar{x}}$ and a covariance
matrix $K_{\bar{x}}$, and that $\bar{x}$ and $\bar{n}$ are independent.
$K_{\bar{x}}$ and $\bar{\bar{x}}$ as used here represent our prior information
about the mean rate of the object rather than any actual statistical
fluctuation of the object. Large terms in $K_{\bar{x}}$ indicate small prior
information about the mean rate of the object and small terms imply large
prior information.

Figure 1. Optical system configuration.


Photomultiplier  Tube  Detector 


For the case where the photomultiplier tube acts as the detector of
the optical image, the $k$th photon gives rise to $B_k$ electrons in the
output of the detector. The output due to each photon is assumed to be
independent of the outputs due to other photons but identically distributed
with mean $b$ and variance $\sigma_b^2$. From the photon stream incident upon
the $i$th cell of the image plane we have

$$z_i = \sum_{k=1}^{y_i} B_k. \tag{17}$$

The element $z_i$ is the sum of a random number of independent random
variables (Parzen, 1962). Because of the Poisson nature of the photon stream
the conditional mean and conditional variance of $y_i$ are equal (i.e.,
$\operatorname{var}(y_i \mid \bar{x}, \bar{n}) = E(y_i \mid \bar{x}, \bar{n})$). The mean
and variance of $z_i$ are respectively

$$E(z_i) = E(y_i)\, E(B) = \tau b\, (a_i \bar{\bar{x}} + \bar{\bar{n}}_i), \tag{18}$$

$$\begin{aligned} \operatorname{Var}(z_i) &= E(y_i) \operatorname{Var}(B) + \operatorname{Var}(y_i)\, E^2(B) \\ &= \sigma_b^2\, \tau (a_i \bar{\bar{x}} + \bar{\bar{n}}_i) + b^2 (a_i K_x a_i' + K_{n_i n_i}) + \tau^2 b^2 (a_i K_{\bar{x}} a_i' + K_{\bar{n}_i \bar{n}_i}) \\ &= (\sigma_b^2 + b^2)\, \tau (a_i \bar{\bar{x}} + \bar{\bar{n}}_i) + \tau^2 b^2 (a_i K_{\bar{x}} a_i' + K_{\bar{n}_i \bar{n}_i}) \end{aligned} \tag{19}$$


where $a_i$ is defined as the $i$th row of the system matrix $A$ and the
prime on $a_i$ denotes transpose. Thus,

$$a_i = [a_{i1}, \ldots, a_{im}], \qquad a_i \bar{\bar{x}} = a_{i1} \bar{\bar{x}}_1 + \cdots + a_{im} \bar{\bar{x}}_m. \tag{20}$$

The quantities $\bar{\bar{x}}$ and $\bar{\bar{n}}_i$ are defined as $E(\bar{x})$ and
$E(\bar{n}_i)$ respectively. Point spread functions $a_{ij}$ for various optical
system apertures are derived in the Appendix.
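The mean and variance of a sum of a random number of independent terms follow from conditioning on the number of terms. As a brief sketch, assuming the $B_k$ are independent of the photon count $y_i$ and of each other, with mean $b$ and variance $\sigma_b^2$:

$$E(z_i \mid y_i) = b\, y_i, \qquad \operatorname{Var}(z_i \mid y_i) = \sigma_b^2\, y_i,$$

$$E(z_i) = E[E(z_i \mid y_i)] = b\, E(y_i),$$

$$\operatorname{Var}(z_i) = E[\operatorname{Var}(z_i \mid y_i)] + \operatorname{Var}[E(z_i \mid y_i)] = \sigma_b^2\, E(y_i) + b^2 \operatorname{Var}(y_i).$$

The same conditioning argument, applied once more to the mean rates, splits $\operatorname{Var}(y_i)$ into its conditional (Poisson) part and the part due to uncertainty in the mean rates.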

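For the vector case, the mean of $z$ is

$$E(z) = \tau b\, (A \bar{\bar{x}} + \bar{\bar{n}}) \tag{21}$$

and the covariance matrix of $z$ is

$$\operatorname{Cov}(z_i, z_j) = E(zz') - E(z)\, E(z') \tag{22}$$

where

$$E(z)\, E(z') = \tau^2 b^2 (A \bar{\bar{x}} \bar{\bar{x}}' A' + A \bar{\bar{x}} \bar{\bar{n}}' + \bar{\bar{n}} \bar{\bar{x}}' A' + \bar{\bar{n}} \bar{\bar{n}}'). \tag{23}$$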
We need to calculate $E(zz')$, which can be written as

$$E(zz') = \begin{bmatrix} E(z_1 z_1) & \cdots & E(z_1 z_m) \\ \vdots & & \vdots \\ E(z_m z_1) & \cdots & E(z_m z_m) \end{bmatrix}. \tag{24}$$


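For the diagonal terms of (24) we have

$$\begin{aligned} E(z_i z_i) &= E[E(z_i z_i \mid \bar{x}, \bar{n})] \\ &= \tau^2 b^2 [a_i (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') a_i' + \bar{\bar{n}}_i \bar{\bar{x}}' a_i' + a_i \bar{\bar{x}} \bar{\bar{n}}_i + K_{\bar{n}_i \bar{n}_i} + \bar{\bar{n}}_i \bar{\bar{n}}_i] \\ &\quad + b^2 [a_i K_x a_i' + K_{n_i n_i}] + \tau \sigma_b^2 (a_i \bar{\bar{x}} + \bar{\bar{n}}_i) \\ &= (\sigma_b^2 + b^2)\, \tau (a_i \bar{\bar{x}} + \bar{\bar{n}}_i) + \tau^2 b^2 [a_i (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') a_i' + \bar{\bar{n}}_i \bar{\bar{x}}' a_i' + a_i \bar{\bar{x}} \bar{\bar{n}}_i \\ &\quad + K_{\bar{n}_i \bar{n}_i} + \bar{\bar{n}}_i \bar{\bar{n}}_i]. \end{aligned} \tag{25}$$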

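Now consider the off-diagonal elements of (24), which are

$$\begin{aligned} E(z_i z_j) &= E[E(z_i z_j \mid \bar{x}, \bar{n})] \\ &= \tau^2 b^2 [a_i (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') a_j' + a_i \bar{\bar{x}} \bar{\bar{n}}_j + \bar{\bar{n}}_i \bar{\bar{x}}' a_j' + K_{\bar{n}_i \bar{n}_j} + \bar{\bar{n}}_i \bar{\bar{n}}_j] \\ &\quad + b^2 [a_i K_x a_j' + K_{n_i n_j}] + \rho_{ij}\, \sigma_b^2\, E\sqrt{(a_i x + n_i)(a_j x + n_j)} \end{aligned} \tag{26}$$

where $\rho_{ij}$ is the correlation coefficient of $z_i$ and $z_j$. We will make
the following definition:

$$C_z = [C_{z_{ij}}] = \left( \rho_{ij}\, E\sqrt{(a_i x + n_i)(a_j x + n_j)} \right). \tag{27}$$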
Thus we can write

$$E(zz') = b^2 \tau^2 [A (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') A' + \bar{\bar{n}} \bar{\bar{x}}' A' + A \bar{\bar{x}} \bar{\bar{n}}' + K_{\bar{n}} + \bar{\bar{n}} \bar{\bar{n}}'] + b^2 [A K_x A' + K_n] + \sigma_b^2 C_z. \tag{28}$$

In this paper we will assume that the numbers of photoelectrons
emitted from disjoint regions or cells of the detector are statistically
independent (Helstrom, 1964; Farrell, 1966). Hence, the multiplicative
noise will be assumed to be uncorrelated. For uncorrelated multiplicative
noise (i.e., $\rho_{ij} = 0$ for $i \neq j$ and $\rho_{ii} = 1$) we have

$$C_z = \tau (a_i \bar{\bar{x}} + \bar{\bar{n}}_i)\, \delta_{ij} = \begin{bmatrix} \tau (a_1 \bar{\bar{x}} + \bar{\bar{n}}_1) & & \\ & \ddots & \\ & & \tau (a_m \bar{\bar{x}} + \bar{\bar{n}}_m) \end{bmatrix}. \tag{29}$$

The covariance matrix of $z$ can now be written as

$$\operatorname{Cov}(z_i, z_j) = \sigma_b^2 C_z + b^2 (A K_x A' + K_n) = \tau \sigma_b^2 (a_i \bar{\bar{x}} + \bar{\bar{n}}_i)\, \delta_{ij} + b^2 (A K_x A' + K_n). \tag{30}$$

We will assume for convenience that the numbers of photons impinging
upon different regions or cells of the detector are statistically
independent. This assumption is by no means essential. It makes the matrix
sum $A K_x A' + K_n$ diagonal, and due to the Poisson nature of the photon
stream this sum becomes $\tau (a_i \bar{\bar{x}} + \bar{\bar{n}}_i)\, \delta_{ij}$. The
covariance matrix of $z$ is then

$$\operatorname{Cov}(z_i, z_j) = (\sigma_b^2 + b^2)\, \tau (a_i \bar{\bar{x}} + \bar{\bar{n}}_i)\, \delta_{ij}. \tag{31}$$


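Also,

$$E(zz') = (\sigma_b^2 + b^2)\, C_z + \tau^2 b^2 [A (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') A' + \bar{\bar{n}} \bar{\bar{x}}' A' + A \bar{\bar{x}} \bar{\bar{n}}' + K_{\bar{n}} + \bar{\bar{n}} \bar{\bar{n}}']. \tag{32}$$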

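We now want to find the linear estimate $\hat{\bar{x}}$ of $\bar{x}$ which will
minimize the mean-square error (MSE). That is, find $\hat{\bar{x}}$ to minimize

$$e = E[(\hat{\bar{x}} - \bar{x})' (\hat{\bar{x}} - \bar{x})] = \operatorname{tr} E[(\hat{\bar{x}} - \bar{x}) (\hat{\bar{x}} - \bar{x})'] \tag{33}$$

where $\operatorname{tr}$ denotes the trace of a matrix.

For the linear estimate of $\bar{x}$ we write

$$\hat{\bar{x}} = Hz/\tau + v. \tag{34}$$

To simplify the mathematics later on let

$$v = -Hb (A \bar{\bar{x}} + \bar{\bar{n}}) + \omega. \tag{35}$$

The linear estimate of $\bar{x}$ is then

$$\hat{\bar{x}} = H [z/\tau - b (A \bar{\bar{x}} + \bar{\bar{n}})] + \omega. \tag{36}$$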
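We need to find a matrix $H$ (discrete linear filter) and a vector $\omega$
such that the mean-square error is minimized. Substituting (36) into (33)
and expanding yields

$$\begin{aligned} e = \frac{1}{\tau^2} \operatorname{tr} E \Big\{ & H \big[ zz' - b\tau\, z (A \bar{\bar{x}} + \bar{\bar{n}})' - b\tau\, (A \bar{\bar{x}} + \bar{\bar{n}}) z' + b^2 \tau^2 (A \bar{\bar{x}} + \bar{\bar{n}})(A \bar{\bar{x}} + \bar{\bar{n}})' \big] H' \\ & + \tau H \big[ z - b\tau (A \bar{\bar{x}} + \bar{\bar{n}}) \big] (\omega - \bar{x})' + \tau (\omega - \bar{x}) \big[ z - b\tau (A \bar{\bar{x}} + \bar{\bar{n}}) \big]' H' \\ & + (\tau^2 \omega \omega' - \tau^2 \omega \bar{x}' - \tau^2 \bar{x} \omega' + \tau^2 \bar{x} \bar{x}') \Big\}. \end{aligned} \tag{37}$$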

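Carrying out the expectation operation we obtain

$$E(zz') = (\sigma_b^2 + b^2)\, C_z + b^2 \tau^2 [A (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') A' + \bar{\bar{n}} \bar{\bar{x}}' A' + A \bar{\bar{x}} \bar{\bar{n}}' + K_{\bar{n}} + \bar{\bar{n}} \bar{\bar{n}}'], \tag{38}$$

$$E(z \bar{\bar{x}}') = b\tau (A \bar{\bar{x}} \bar{\bar{x}}' + \bar{\bar{n}} \bar{\bar{x}}'), \tag{39}$$

$$E(\bar{x} \bar{x}') = K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}', \tag{40}$$

$$E(\omega \bar{x}') = \omega \bar{\bar{x}}', \tag{41}$$

$$E(\omega z') = b\tau (\omega \bar{\bar{x}}' A' + \omega \bar{\bar{n}}'), \tag{42}$$

$$E(\bar{x} z') = b\tau (\bar{\bar{x}} \bar{\bar{x}}' + K_{\bar{x}}) A' + b\tau\, \bar{\bar{x}} \bar{\bar{n}}', \tag{43}$$

etc.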


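Substituting the expected values of (38)-(43) into (37) yields

$$\begin{aligned} e = \frac{1}{\tau^2} \operatorname{tr} \Big\{ & H [b^2 \tau^2 (A K_{\bar{x}} A' + K_{\bar{n}}) + (\sigma_b^2 + b^2)\, C_z] H' - b \tau^2 H A K_{\bar{x}} - b \tau^2 K_{\bar{x}} A' H' \\ & + \tau^2 \omega \omega' - \tau^2 \omega \bar{\bar{x}}' - \tau^2 \bar{\bar{x}} \omega' + \tau^2 (K_{\bar{x}} + \bar{\bar{x}} \bar{\bar{x}}') \Big\}. \end{aligned} \tag{44}$$

Minimizing (44) with respect to $\omega$ requires that $\omega = \bar{\bar{x}}$, and
minimizing with respect to $H$ requires that

$$\begin{aligned} H &= \frac{K_{\bar{x}}}{b} A' (g C_z + K_{\bar{n}} + A K_{\bar{x}} A')^{-1} \\ &= \frac{1}{b} [A' (g C_z + K_{\bar{n}})^{-1} A + K_{\bar{x}}^{-1}]^{-1} A' (g C_z + K_{\bar{n}})^{-1} \end{aligned} \tag{45}$$

where $g = (\sigma_b^2 + b^2) / \tau^2 b^2$.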

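This is the optimum discrete linear filter in the MSE sense. The
optimum linear estimate of $\bar{x}$ is now

$$\begin{aligned} \hat{\bar{x}} &= H [z/\tau - b (A \bar{\bar{x}} + \bar{\bar{n}})] + \bar{\bar{x}} = H (z/\tau - b \bar{\bar{n}}) + (I - bHA)\, \bar{\bar{x}} \\ &= \frac{1}{b} [A' (g C_z + K_{\bar{n}})^{-1} A + K_{\bar{x}}^{-1}]^{-1} A' (g C_z + K_{\bar{n}})^{-1} (z/\tau - b \bar{\bar{n}}) \\ &\quad + [A' (g C_z + K_{\bar{n}})^{-1} A + K_{\bar{x}}^{-1}]^{-1} K_{\bar{x}}^{-1}\, \bar{\bar{x}}. \end{aligned} \tag{46}$$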
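The two expressions for the optimum filter $H$ are related by the matrix inversion lemma. The sketch below checks numerically, for a 2 x 2 case, that $\frac{1}{b} K_{\bar{x}} A' (g C_z + K_{\bar{n}} + A K_{\bar{x}} A')^{-1}$ and $\frac{1}{b} [A' (g C_z + K_{\bar{n}})^{-1} A + K_{\bar{x}}^{-1}]^{-1} A' (g C_z + K_{\bar{n}})^{-1}$ agree. All matrices and constants are made-up illustrative values, and the helper functions are ours.

```python
def mat_mul(P, Q):
    """Product of two matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def mat_add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def inv2(P):
    """Inverse of a 2 x 2 matrix."""
    (p, q), (r, s) = P
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

# Hypothetical values chosen only to exercise the formulas.
tau, b, sigma_b2 = 1.0, 2.0, 0.25
g = (sigma_b2 + b * b) / (tau * tau * b * b)
A = [[1.0, 0.3], [0.3, 1.0]]
K_x = [[4.0, 0.0], [0.0, 9.0]]      # prior covariance of the object mean rate
K_n = [[0.5, 0.0], [0.0, 0.5]]      # prior covariance of the noise mean rate
C_z = [[3.0, 0.0], [0.0, 2.0]]

S = mat_add(scale(g, C_z), K_n)     # g C_z + K_n
At = transpose(A)

# First form: (1/b) K_x A' (S + A K_x A')^{-1}
H1 = scale(1.0 / b, mat_mul(mat_mul(K_x, At),
                            inv2(mat_add(S, mat_mul(mat_mul(A, K_x), At)))))

# Second form: (1/b) [A' S^{-1} A + K_x^{-1}]^{-1} A' S^{-1}
M = mat_add(mat_mul(mat_mul(At, inv2(S)), A), inv2(K_x))
H2 = scale(1.0 / b, mat_mul(mat_mul(inv2(M), At), inv2(S)))

print(H1)
print(H2)
```

The second form is often preferred when the prior covariance is large, because it depends on the prior only through its inverse, which then tends to zero.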
z  n  x  x 

If the mean $E(\hat{x})$ of an estimate $\hat{x}$ equals $x$, the estimate
$\hat{x}$ of $x$ is said to be unbiased; if not, the difference
$E(\hat{x}) - x$ is defined as the bias of the estimate. In the absence of a
priori information about the estimate, it is desirable that the bias of the
estimate be small or zero. We will now check to see if the estimate
$\hat{\bar{x}}_o$ is biased, where $\bar{x}_o$ is some actual but unknown mean
rate object vector. If $\hat{\bar{x}}_o$ is unbiased then
$E(\hat{\bar{x}}_o) = E(\bar{x}_o) = \bar{x}_o$. For the case being considered

$$\begin{aligned} \text{bias} &= E[H (z - b\tau (A \bar{\bar{x}} + \bar{\bar{n}})) + \tau \bar{\bar{x}}] - \bar{x}_o \tau \\ &= H (b A \bar{x}_o \tau - b \tau A \bar{\bar{x}}) + \bar{\bar{x}} \tau - \bar{x}_o \tau \\ &= (bHA - I)(\bar{x}_o - \bar{\bar{x}})\, \tau \end{aligned} \tag{47}$$

where $I$ is the identity matrix. The estimate $\hat{\bar{x}}_o$ is unbiased if
either

1) $\bar{x}_o = \bar{\bar{x}}$ or

2) $bHA = I$.

The second condition implies that $K_{\bar{x}}^{-1} = 0$, where $0$ is a matrix of
zeros.
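The link between the condition $bHA = I$ and the prior covariance can be made explicit. As a sketch, write $S = g C_z + K_{\bar{n}}$ and $M = A' S^{-1} A + K_{\bar{x}}^{-1}$ (shorthand introduced here for illustration only), and use the second form of the optimum filter, $H = \frac{1}{b} M^{-1} A' S^{-1}$:

$$bHA = M^{-1} A' S^{-1} A,$$

$$I - bHA = M^{-1} (M - A' S^{-1} A) = M^{-1} K_{\bar{x}}^{-1}.$$

Hence $bHA = I$ holds exactly when $M^{-1} K_{\bar{x}}^{-1} = 0$, that is, when $K_{\bar{x}}^{-1} = 0$, and the bias is small whenever $K_{\bar{x}}^{-1}$ is small compared with $A' S^{-1} A$.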

The covariance matrix $K_{\bar{x}}$ is related to the a priori information
about the object. If the elements of $K_{\bar{x}}$ are large (particularly the
diagonal elements) the prior information about the signal is small. Hence,
for large a priori uncertainty about the object, $K_{\bar{x}}^{-1} \approx 0$. By
$K_{\bar{x}}^{-1} \approx 0$ we mean that the elements of $K_{\bar{x}}^{-1}$ are small
in comparison with $A' (g C_z + K_{\bar{n}})^{-1} A$.

For large a priori uncertainty we can write $H$ as follows:

$$H = \frac{1}{b} [A' (g C_z + K_{\bar{n}})^{-1} A]^{-1} A' (g C_z + K_{\bar{n}})^{-1}. \tag{48}$$

Since $K_{\bar{x}}^{-1} \approx 0$ for large a priori uncertainty, the estimate for
this case is unbiased.

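To evaluate the optimum estimation or restoration procedure, we
must find the MSE for the actual but unknown object vector $\bar{x}_o$.
Assuming large a priori uncertainty we have for our minimum MSE
estimate of $\bar{x}_o$

$$\hat{\bar{x}}_o = \frac{1}{b} [A' (g C_z + K_{\bar{n}})^{-1} A]^{-1} A' (g C_z + K_{\bar{n}})^{-1} (z/\tau - b \bar{\bar{n}}). \tag{49}$$

Given that the object vector is $\bar{x}_o$, the MSE is given by

$$e = \operatorname{tr} E[(\hat{\bar{x}}_o - \bar{x}_o)(\hat{\bar{x}}_o - \bar{x}_o)']. \tag{50}$$

Substituting (49) into (50) yields

$$e = \operatorname{tr} \Big\{ [A' (g C_z + K_{\bar{n}})^{-1} A]^{-1} A' [g C_z + K_{\bar{n}}]^{-1} [g C_z(\bar{x}_o) + K_{\bar{n}}] [g C_z + K_{\bar{n}}]^{-1} A\, [A' (g C_z + K_{\bar{n}})^{-1} A]^{-1} \Big\} \tag{51}$$

where $C_z(\bar{x}_o) = \tau (a_i \bar{x}_o + \bar{\bar{n}}_i)\, \delta_{ij}$.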

Let us consider a simple example to investigate the weighting
due to $\bar{x}_o$ (the actual object vector). Assume that we have a slit
aperture, white noise, and that the object is made up of two point sources
which are separated by the Rayleigh criterion distance. We will make $k$
measurements at the peak of each point spread function. Also, let
$D/vR$ equal one. See the Appendix for clarification of these assumptions.
The $2k \times 2$ matrix $A$ becomes


Substituting  this  matrix  into  (51)  and  carrying  out  the  indicated  operations 
yields 


e 


”2> J 


(52) 


where $k = 0, 1, 2, \ldots$ and the actual object vector is
$\bar{x}_o = \begin{bmatrix} \bar{x}_{o1} \\ \bar{x}_{o2} \end{bmatrix}$.

The weighting due to $\bar{x}_o$ is not significant if
$\bar{\bar{n}}_1 + \bar{\bar{n}}_2 \gg \bar{x}_{o1} + \bar{x}_{o2}$ (small signal-to-noise
ratio (SNR)), if $\tau$ becomes large, or if $\sigma_{\bar{n}}^2$ is large. Figure 3
shows how the error varies with $\tau$ for various signal values, a single
noise value, and $k = 1$.

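In general, as $\tau$ becomes very large,
$K_{\bar{n}} \gg g C_z = \dfrac{\sigma_b^2 + b^2}{\tau b^2} (a_i \bar{\bar{x}} + \bar{\bar{n}}_i)\, \delta_{ij}$.
Hence, as $\tau$ becomes very large (51) reduces to

$$e = \operatorname{tr} (A' K_{\bar{n}}^{-1} A)^{-1}. \tag{53}$$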

For the special case of white noise (i.e., $K_{\bar{n}} = \sigma_{\bar{n}}^2 I$, where
$I$ is the identity matrix) we have

$$e = \sigma_{\bar{n}}^2 \operatorname{tr} (A'A)^{-1}. \tag{54}$$

The factor $(A'A)^{-1}$ in (54) can be thought of as an amplifier of the
noise $\sigma_{\bar{n}}^2$. The amplification increases with decreased aperture size
and/or decreased spacing between the point sources. Let

$$G = \operatorname{tr} (A'A)^{-1}. \tag{55}$$

To investigate the nature of $G$ let us consider the infinite slit aperture.
The point spread function for this case is

$$\frac{D^2}{v^2 R^2}\, \frac{\sin^2 [\pi (D/vR)(\xi - h)]}{[\pi (D/vR)(\xi - h)]^2} \tag{56}$$

where $D$ is the width of the aperture, $h$ is the distance of the point
source from the origin, and $R$ is the distance from the image plane and
object plane to the aperture plane. For simplicity, consider an object
consisting of two point sources (one at the origin and one at a distance
$h$ from the origin; see Figure 4). The $A$ matrix becomes

$$A = \begin{bmatrix} (D^2/v^2R^2)\, \operatorname{sinc}^2 [(D/vR)(\xi_1 - h)] & (D^2/v^2R^2)\, \operatorname{sinc}^2 [D \xi_1 / vR] \\ (D^2/v^2R^2)\, \operatorname{sinc}^2 [(D/vR)(\xi_2 - h)] & (D^2/v^2R^2)\, \operatorname{sinc}^2 [D \xi_2 / vR] \end{bmatrix} \tag{57}$$

where $\xi_1$ and $\xi_2$ are the measurement positions in the image plane
and $\operatorname{sinc} x = \sin \pi x / \pi x$. If we make one measurement at the
peak of each point spread function and let $vR$ equal one, we have

$$A = \begin{bmatrix} D^2 & D^2 \operatorname{sinc}^2 (Dh) \\ D^2 \operatorname{sinc}^2 (Dh) & D^2 \end{bmatrix}. \tag{58}$$

The amplification factor $G$ then becomes

$$G = \operatorname{tr} (A'A)^{-1} = \frac{2}{D^4}\, \frac{1 + \operatorname{sinc}^4 (Dh)}{[1 - \operatorname{sinc}^4 (Dh)]^2}. \tag{59}$$

Figure 5 shows how the amplification factor $G$ varies with aperture
width $D$ and separation of point sources $h$. The abrupt increase of
$\log_{10} G$ occurs when the size of the object (separation of the two point
sources) becomes approximately the size of the point spread function
(see Harris and Rushforth (1966)).
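The closed form for the amplification factor can be checked against a direct evaluation of $\operatorname{tr}(A'A)^{-1}$ for the 2 x 2 system matrix with entries $D^2$ on the diagonal and $D^2 \operatorname{sinc}^2(Dh)$ off the diagonal. In the sketch below the values of $D$ and $h$ are illustrative and the function names are ours.

```python
import math

def sinc(x):
    """sinc x = sin(pi x) / (pi x), with sinc 0 = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def amplification_closed(D, h):
    """G = (2 / D^4) (1 + sinc^4(Dh)) / (1 - sinc^4(Dh))^2."""
    c4 = sinc(D * h) ** 4
    return 2.0 * (1.0 + c4) / (D ** 4 * (1.0 - c4) ** 2)

def amplification_numeric(D, h):
    """G = tr (A'A)^{-1}, computed directly for the 2 x 2 matrix A."""
    c = sinc(D * h) ** 2
    A = [[D * D, D * D * c], [D * D * c, D * D]]
    # M = A'A
    M = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (M[0][0] + M[1][1]) / det   # trace of the inverse of a 2 x 2 matrix

for D, h in [(1.0, 0.5), (2.0, 0.3), (1.5, 0.1)]:
    print(D, h, amplification_closed(D, h), amplification_numeric(D, h))
```

Both a smaller separation $h$ and a smaller aperture width $D$ increase $G$, in line with the statement that the noise amplification grows as the sources move together or the aperture shrinks.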

Special  Cases 

Because  of  the  complexity  of  the  general  estimate  in  (46)  we  will 
consider  various  special  cases  in  order  to  gain  a  better  understanding 
of  the  estimation  procedure. 
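Prior information dependence

Let $C_z = \sigma_z^2 I$, $K_{\bar x} = \sigma_{\bar x}^2 I$, and $K_{\bar n} = \sigma_{\bar n}^2 I$. Our estimate of $\bar x$ then becomes

$$\hat x = \left[\frac{A'A}{g\sigma_z^2+\sigma_{\bar n}^2} + \frac{I}{\sigma_{\bar x}^2}\right]^{-1}\frac{A'(z-b\tau\bar{\bar n})}{b\tau\,(g\sigma_z^2+\sigma_{\bar n}^2)} + \left[\frac{A'A}{g\sigma_z^2+\sigma_{\bar n}^2} + \frac{I}{\sigma_{\bar x}^2}\right]^{-1}\frac{\bar{\bar x}}{\sigma_{\bar x}^2}. \qquad (60)$$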


For large a priori uncertainty, $\sigma_{\bar x}^2 \gg g\sigma_z^2 + \sigma_{\bar n}^2$, we have

$$\hat x \approx \frac{1}{b\tau}(A'A)^{-1}A'(z-b\tau\bar{\bar n}) \qquad (61)$$

which indicates that we ignore the a priori mean $\bar{\bar x}$. For large prior information, $g\sigma_z^2 + \sigma_{\bar n}^2 \gg \sigma_{\bar x}^2$, we have

$$\hat x \approx \bar{\bar x} \qquad (62)$$

which means that our prior mean is very reliable and that we learn very little from our experiment.
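The two limits (61) and (62) can be illustrated numerically. The sketch below assumes the white-covariance estimate has the form $[A'A/v + I/\sigma_{\bar x}^2]^{-1}[A'd/v + \bar{\bar x}/\sigma_{\bar x}^2]$, where $d$ stands for the debiased, rescaled data $(z - b\tau\bar{\bar n})/b\tau$ and $v$ for $g\sigma_z^2 + \sigma_{\bar n}^2$. The matrix, data, and function names are hypothetical illustrations.

```python
def solve2(M, r):
    """Solve the 2x2 linear system M x = r by Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]

def prior_weighted_estimate(A, d, x_prior, v, var_x):
    """Return [A'A/v + I/var_x]^{-1} [A'd/v + x_prior/var_x] for a 2x2 A.

    d stands for the debiased data (z - b*tau*n_mean)/(b*tau),
    v for g*sigma_z^2 + sigma_n^2, and var_x for the prior variance.
    """
    ata = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    atd = [sum(A[k][i] * d[k] for k in range(2)) for i in range(2)]
    M = [[ata[i][j] / v + (1.0 / var_x if i == j else 0.0) for j in range(2)]
         for i in range(2)]
    r = [atd[i] / v + x_prior[i] / var_x for i in range(2)]
    return solve2(M, r)

A = [[1.0, 0.4], [0.4, 1.0]]     # hypothetical system matrix
d = [3.0, 1.0]                   # hypothetical debiased data
x_prior = [1.0, 1.0]             # hypothetical a priori mean
least_squares = solve2(A, d)     # A is square here, so (A'A)^{-1}A'd = A^{-1}d

vague = prior_weighted_estimate(A, d, x_prior, v=1.0, var_x=1e9)
sharp = prior_weighted_estimate(A, d, x_prior, v=1.0, var_x=1e-9)
print(vague, least_squares)      # large prior variance: data dominate
print(sharp, x_prior)            # small prior variance: prior dominates
```

With a very large prior variance the estimate coincides with the least-squares solution, and with a very small prior variance it collapses onto the prior mean.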
Large a priori uncertainty

Consider the case of large a priori uncertainty (i.e., $K_{\bar x}^{-1} \to 0$). Our estimate of $\bar x$ becomes

$$\hat x = \frac{1}{b\tau}\,[A'(gC_z+K_{\bar n})^{-1}A]^{-1}A'(gC_z+K_{\bar n})^{-1}(z-b\tau\bar{\bar n}) \qquad (63)$$

and the MSE is

$$e = \mathrm{tr}\Big\{[A'(gC_z+K_{\bar n})^{-1}A]^{-1}A'(gC_z+K_{\bar n})^{-1}\,(gC_z(x_0)+K_{\bar n})\,(gC_z+K_{\bar n})^{-1}A\,[A'(gC_z+K_{\bar n})^{-1}A]^{-1}\Big\}. \qquad (64)$$

Matrix A with inverse. If the system matrix $A$ is square and invertible, (63) reduces to

$$\hat x = A^{-1}(z/b\tau - \bar{\bar n}) \qquad (65)$$

and (64) reduces to

$$e = \mathrm{tr}\,[A^{-1}(gC_z(x_0)+K_{\bar n})A'^{-1}]. \qquad (66)$$

The above estimate of $\bar x$ is an intuitive estimate since all we do is divide out the multiplicative effect, subtract the noise, and then pass this result through an inverse filter. This procedure is illustrated in Figure 6.
Small signal-to-noise ratio. For small signal-to-noise ratio (SNR) (i.e., $a_i\bar{\bar x} \ll \bar{\bar n}_i$) the linear filter $H$ becomes

$$H = \frac{1}{b}\,[A'(g\tau\bar{\bar n}_i\delta_{ij}+K_{\bar n})^{-1}A]^{-1}A'(g\tau\bar{\bar n}_i\delta_{ij}+K_{\bar n})^{-1} \qquad (67)$$

and the MSE becomes

$$e = \mathrm{tr}\,[A'(g\tau\bar{\bar n}_i\delta_{ij}+K_{\bar n})^{-1}A]^{-1}. \qquad (68)$$

The MSE becomes independent of the signal for small SNR (see Figure 3).


If the noise is independent of $i$ (uniform noise) then $\bar{\bar n}_i\delta_{ij} = NI$, where $\bar{\bar n}_i = N$ for all $i$. For uniform, white noise we have

$$H = \frac{1}{b}\,(A'A)^{-1}A' \qquad (69)$$

and

$$e = (g\tau N + \sigma_{\bar n}^2)\,\mathrm{tr}\,(A'A)^{-1}. \qquad (70)$$

Perfect detector. For a perfect detector, $B$ in (17) is a constant and hence each photon gives rise to exactly the same number, $b$, of photoelectrons (i.e., $\sigma_b^2 = 0$). For this case

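$$H = \frac{1}{b}\,[A'(\tau(a_i\bar{\bar x}+\bar{\bar n}_i)\delta_{ij}+\tau^2K_{\bar n})^{-1}A]^{-1}A'(\tau(a_i\bar{\bar x}+\bar{\bar n}_i)\delta_{ij}+\tau^2K_{\bar n})^{-1} \qquad (71)$$

and

$$e = \frac{1}{\tau^2}\,\mathrm{tr}\Big\{[A'M^{-1}A]^{-1}A'M^{-1}\,M_0\,M^{-1}A\,[A'M^{-1}A]^{-1}\Big\}, \qquad (72)$$

where, to shorten the expression, $M = \tau(a_i\bar{\bar x}+\bar{\bar n}_i)\delta_{ij}+\tau^2K_{\bar n}$ and $M_0 = \tau(a_ix_0+\bar{\bar n}_i)\delta_{ij}+\tau^2K_{\bar n}$.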


Except for the constant $b$ in the expression for $H$, these are the same results that one would obtain by counting the photons incident upon the image plane and in turn finding the minimum MSE estimate of the mean rate of signal photons emitted from the optical object (i.e., minimum MSE for no multiplicative or detector noise).

Large additive noise covariance matrix. For large $K_{\bar n}$ (i.e., $K_{\bar n} \gg gC_z$) we can write

$$H = \frac{1}{b}\,(A'K_{\bar n}^{-1}A)^{-1}A'K_{\bar n}^{-1} \qquad (73)$$

and

$$e = \mathrm{tr}\,(A'K_{\bar n}^{-1}A)^{-1}. \qquad (74)$$

For this assumption the minimum MSE becomes independent of the multiplicative noise.
Photon-electron converter

The photon-electron converter is a degenerate case of the photomultiplier tube case in which we consider only its first stage. This is the detector model that Helstrom (1964) uses in some of his work on optical signal detection. We are assuming that the stream of photons $y_i$ that impinge upon the $i$th cell has a conditional Poisson distribution with a conditional mean rate of $\bar y_i = (a_i\bar x + \bar n_i)$; hence, $E(y_i/\bar x, \bar n) = \mathrm{var}(y_i/\bar x, \bar n) = \bar y_i$. In the photon-electron converter each photon gives rise to an emitted electron with probability $\eta$. The stream of photoelectrons $z_i$ being emitted from the $i$th cell, therefore, has a conditional Poisson distribution with a conditional mean rate of $\eta(a_i\bar x + \bar n_i) = \eta\bar y_i$, since a Poisson process is preserved under random selection (Parzen, 1962). Hence, $\mathrm{var}(z_i/\bar x, \bar n) = E(z_i/\bar x, \bar n) = \eta\bar y_i$. The incoming photons and emitted photoelectrons are related by their means: $\eta E(\bar y_i) = \eta E(a_i\bar x + \bar n_i) = E(z_i)$. Since $z_i$ is a Poisson process, $\mathrm{var}(z_i/\bar x, \bar n) = E(z_i/\bar x, \bar n) = \eta E(y_i/\bar x, \bar n)$. From previous results and using the Poisson distribution properties, we have $E(z_i/\bar x, \bar n) = bE(y_i/\bar x, \bar n)$ and $\mathrm{var}(z_i/\bar x, \bar n) = (\sigma_b^2 + b^2)E(y_i/\bar x, \bar n)$. Hence, we can use the previously obtained results and apply them to the photon-electron converter case. This is done by replacing $b$ by $\eta$ and replacing $\sigma_b^2 + b^2$ by $\eta$. Hence, for the photon-electron converter (assuming large a priori uncertainty) we have
$$H = \frac{1}{\eta}\,[A'(C_z/\eta+\tau^2K_{\bar n})^{-1}A]^{-1}A'(C_z/\eta+\tau^2K_{\bar n})^{-1} \qquad (75)$$

and

$$e = \frac{1}{\tau^2}\,\mathrm{tr}\Big\{[A'(C_z/\eta+\tau^2K_{\bar n})^{-1}A]^{-1}A'(C_z/\eta+\tau^2K_{\bar n})^{-1}\,(C_z(x_0)/\eta+\tau^2K_{\bar n})\,(C_z/\eta+\tau^2K_{\bar n})^{-1}A\,[A'(C_z/\eta+\tau^2K_{\bar n})^{-1}A]^{-1}\Big\}. \qquad (76)$$

As $\eta$ approaches unity these results approach those for the case of no multiplicative noise. The case of $\eta = 1$ (perfect detector) implies that every photon that impinges upon the detector causes an electron to be emitted with probability one. As $\eta$ approaches zero, the error becomes extremely large.
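The statement that a Poisson stream remains Poisson under random selection can be checked directly. The sketch below, with hypothetical parameter values, forms the distribution of emitted electrons by summing over the photon count (each photon kept independently with probability $\eta$) and compares it with a Poisson distribution of mean $\eta$ times the photon mean.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k events, computed in log space."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def thinned_pmf(z, mean_photons, eta, y_max=200):
    """P(z electrons) = sum over y of Poisson(y; mean) * Binomial(z; y, eta)."""
    total = 0.0
    for y in range(z, y_max + 1):
        binom = math.comb(y, z) * eta ** z * (1.0 - eta) ** (y - z)
        total += poisson_pmf(y, mean_photons) * binom
    return total

mean_photons = 12.0   # hypothetical mean photon count in the interval
eta = 0.3             # hypothetical quantum efficiency

probs = [thinned_pmf(z, mean_photons, eta) for z in range(60)]
mean_z = sum(z * p for z, p in enumerate(probs))
var_z = sum((z - mean_z) ** 2 * p for z, p in enumerate(probs))
print(mean_z, var_z, eta * mean_photons)   # mean and variance both equal eta * mean
```

The equality of mean and variance is the property used in the text when $b$ is replaced by $\eta$ and $\sigma_b^2 + b^2$ by $\eta$.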

Estimation of mean rate $\bar x$ from observations of $\bar z$

In many realistic cases where we have a large number of counts, equipment capabilities allow us to measure only the intensity or mean rate of photoelectrons $\bar z$. (We are assuming that the sample mean equals $\bar z$ by the law of large numbers.) For this case we want to find the linear estimate $\hat x$ of $\bar x$ from observations of $\bar z$ which will minimize the mean-square error; that is, find $\hat x$ to minimize

$$e = \mathrm{tr}\,E[(\hat x-\bar x)(\hat x-\bar x)']. \qquad (77)$$

From the estimate $\hat x$ we can obtain an estimate of the intensity vector by multiplying $\hat x$ by $h\nu$. From previous results $\bar z = b\bar y$. We also have


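$$\bar y = A\bar x + \bar n, \qquad (78)$$

$$E(\bar y) = \bar{\bar y} = A\bar{\bar x} + \bar{\bar n}, \qquad (79)$$

$$E(\bar z) = \bar{\bar z} = b(A\bar{\bar x} + \bar{\bar n}), \qquad (80)$$

$$\mathrm{Cov}(\bar y_i, \bar y_j) = (AK_{\bar x}A' + K_{\bar n})_{ij}, \qquad (81)$$

$$\mathrm{Cov}(\bar z_i, \bar z_j) = b^2(AK_{\bar x}A' + K_{\bar n})_{ij}. \qquad (82)$$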

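For the linear estimate of $\bar x$ we can write

$$\hat x = H[\bar z - b(A\bar{\bar x} + \bar{\bar n})] + w. \qquad (83)$$

Substituting this expression for $\hat x$ into (77) and carrying out the expectation operation, we obtain

$$H = \frac{1}{b}\,K_{\bar x}A'(K_{\bar n} + AK_{\bar x}A')^{-1} = \frac{1}{b}\,(A'K_{\bar n}^{-1}A + K_{\bar x}^{-1})^{-1}A'K_{\bar n}^{-1}. \qquad (84)$$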

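Hence, our estimate of $\bar x$ is

$$\hat x = \frac{1}{b}\,(A'K_{\bar n}^{-1}A + K_{\bar x}^{-1})^{-1}A'K_{\bar n}^{-1}(\bar z - b\bar{\bar n}) + (A'K_{\bar n}^{-1}A + K_{\bar x}^{-1})^{-1}K_{\bar x}^{-1}\bar{\bar x}. \qquad (85)$$

For $K_{\bar x}^{-1} \to 0$ (large a priori uncertainty) the minimum MSE is

$$e = \mathrm{tr}\,(A'K_{\bar n}^{-1}A)^{-1}. \qquad (86)$$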
This is the same result that was obtained in (53) where we let $\tau$ become large. Except for a constant $b$, these results are the same as those for the case of measuring image intensity, and from this estimating the intensity of the object vector in the absence of any multiplicative noise (Rushforth, 1965).
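The two expressions for $H$ in (84) are related by the matrix inversion lemma. As a sanity check, the sketch below evaluates both forms, $\frac{1}{b}K_{\bar x}A'(K_{\bar n} + AK_{\bar x}A')^{-1}$ and $\frac{1}{b}(A'K_{\bar n}^{-1}A + K_{\bar x}^{-1})^{-1}A'K_{\bar n}^{-1}$, for an arbitrary 2x2 example. All numerical values and helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def scale(s, X):
    return [[s * x for x in row] for row in X]

b = 3.0                              # hypothetical detector gain
A = [[1.0, 0.4], [0.2, 0.9]]         # hypothetical system matrix
K_x = [[2.0, 0.5], [0.5, 1.0]]       # hypothetical signal covariance
K_n = [[1.0, 0.3], [0.3, 1.5]]       # hypothetical noise covariance
At = transpose(A)

# First form: (1/b) K_x A' (K_n + A K_x A')^{-1}
H_first = scale(1 / b, matmul(matmul(K_x, At),
                              inv2(add(K_n, matmul(matmul(A, K_x), At)))))
# Second form: (1/b) (A' K_n^{-1} A + K_x^{-1})^{-1} A' K_n^{-1}
information = add(matmul(matmul(At, inv2(K_n)), A), inv2(K_x))
H_second = scale(1 / b, matmul(matmul(inv2(information), At), inv2(K_n)))

max_diff = max(abs(p - q) for rp, rq in zip(H_first, H_second)
               for p, q in zip(rp, rq))
print(max_diff)   # agreement up to rounding error
```

The second form is the convenient one for taking the limit $K_{\bar x}^{-1} \to 0$, since the prior enters only through the added term $K_{\bar x}^{-1}$.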


Estimation of mean rate $\bar y$ from $z$

The purpose of considering the estimate of $\bar y$ from measurements of $z$ is to use the results in a later section for the detection of unknown signals.

We want to find the linear estimate $\hat y$ of $\bar y$ which will minimize the mean-square error

$$e = \mathrm{tr}\,E[(\hat y-\bar y)(\hat y-\bar y)'] \qquad (87)$$

where $\hat y = H(z/\tau - b\bar{\bar y}) + \bar{\bar y}$. When $\hat y$ is substituted into (87) and the expectation operation carried out, we obtain

$$e = \mathrm{tr}\,\big[H\big(b^2K_{\bar y} + (\sigma_b^2 + b^2)\,C_z/\tau^2\big)H' - bHK_{\bar y} - bK_{\bar y}H' + K_{\bar y}\big]. \qquad (88)$$


The $H$ that minimizes (88) is

$$H = \frac{1}{b}\,K_{\bar y}(gC_z + K_{\bar y})^{-1} \qquad (89)$$

where $g = (\sigma_b^2 + b^2)/\tau^2b^2$. Hence, the minimum MSE estimate of $\bar y$ is

$$\hat y = b\tau K_{\bar y}\,[(\sigma_b^2 + b^2)C_z + b^2\tau^2K_{\bar y}]^{-1}(z - b\tau\bar{\bar y}) + \bar{\bar y}. \qquad (90)$$


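For the photon-electron converter $b = \eta$ and $(\sigma_b^2 + b^2) = \eta$; hence,

$$\hat y = \tau K_{\bar y}\,(C_z + \eta\tau^2K_{\bar y})^{-1}(z - \eta\tau\bar{\bar y}) + \bar{\bar y}. \qquad (91)$$

For $K_{\bar y} = \sigma_{\bar y_i}^2\delta_{ij}$ we can write the estimate of $\bar y_i$ as

$$\hat y_i = \frac{\tau\sigma_{\bar y_i}^2\,(z_i - \eta\tau\bar{\bar y}_i)}{\tau\bar{\bar y}_i + \eta\tau^2\sigma_{\bar y_i}^2} + \bar{\bar y}_i. \qquad (92)$$

For the statistical model we have been assuming, $\bar{\bar y}_i = u_i/\alpha_i$ and $\sigma_{\bar y_i}^2 = u_i/\alpha_i^2$ (see (7) and (10)); hence, after substituting these values into (92) and rearranging we obtain

$$\hat y_i = \frac{z_i + u_i}{\eta\tau + \alpha_i}. \qquad (93)$$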
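The reduction of (92) to (93) can be verified with exact rational arithmetic. The sketch below assumes (92) has the form $\tau\sigma^2(z - \eta\tau m)/(\tau m + \eta\tau^2\sigma^2) + m$, with prior mean $m = u/\alpha$ and prior variance $\sigma^2 = u/\alpha^2$; the numerical values are hypothetical.

```python
from fractions import Fraction

def estimate_92(z, eta, tau, y_mean, y_var):
    """Linear estimate for a diagonal prior covariance (assumed form of (92))."""
    return tau * y_var * (z - eta * tau * y_mean) / (tau * y_mean + eta * tau ** 2 * y_var) + y_mean

def estimate_93(z, eta, tau, u, alpha):
    """Closed form (z + u) / (eta * tau + alpha)."""
    return (z + u) / (eta * tau + alpha)

# Hypothetical values, kept as exact fractions so the comparison is exact.
z, eta, tau = Fraction(7), Fraction(3, 10), Fraction(5)
u, alpha = Fraction(4), Fraction(2)

lhs = estimate_92(z, eta, tau, y_mean=u / alpha, y_var=u / alpha ** 2)
rhs = estimate_93(z, eta, tau, u, alpha)
print(lhs, rhs)   # the two expressions are identical
```

Exact fractions are used so that the agreement is an identity rather than an agreement up to rounding.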


Optimum  Sampling  Scheme 

When the additive noise covariance matrix is large or for large $\tau$ the error expression reduces to that of the additive-noise-only case:

$$e = \mathrm{tr}\,(A'K_{\bar n}^{-1}A)^{-1}. \qquad (94)$$

The problem we now face is to determine an optimum sampling procedure which will minimize the above error expression. We are able to vary the matrix $A$ by varying our sampling positions. Thus, for some optimum sampling positions the MSE will be minimized. The optimum sampling procedure for this case also applies to the case of small SNR with uniform noise.

White  noise 


For white noise $K_{\bar n} = \sigma_{\bar n}^2 I$ and

$$e = \sigma_{\bar n}^2\,\mathrm{tr}\,(A'A)^{-1}. \qquad (95)$$


Single point source. Consider an infinite slit aperture, which is equivalent to reducing the optical problem to one dimension. The point spread function for this case is

$$\frac{D^2}{v^2R^2}\,\mathrm{sinc}^2\!\left[\frac{D}{vR}(\xi-h)\right] \qquad (96)$$

(see Figure 4 and the Appendix). Consider the normalized case where the point spread function is

$$\mathrm{sinc}^2(\xi-h) = \frac{\sin^2[\pi(\xi-h)]}{[\pi(\xi-h)]^2}. \qquad (97)$$


Our objective is to minimize $\mathrm{tr}\,(A'A)^{-1}$ by properly locating our samples. For a single point source located at the origin in the object plane and making $k$ samples in the image plane, the matrix $A$ becomes a vector of the form

$$A = \begin{bmatrix} \mathrm{sinc}^2\,\xi_1 \\ \mathrm{sinc}^2\,\xi_2 \\ \vdots \\ \mathrm{sinc}^2\,\xi_k \end{bmatrix} \qquad (98)$$

where $\xi_1, \xi_2, \ldots, \xi_k$ represent the positions in the image plane where the samples are taken. $A'A$ is then $\sum_{i=1}^{k}\mathrm{sinc}^4\,\xi_i$, and $\mathrm{tr}\,(A'A)^{-1} = (A'A)^{-1} = 1/\sum_{i=1}^{k}\mathrm{sinc}^4\,\xi_i$. Since we are free to make the measurements anywhere in the image plane, we want to make the measurements such that $\sum_{i=1}^{k}\mathrm{sinc}^4\,\xi_i$ is a maximum. Because $\mathrm{sinc}^4\,\xi \le 1$ with equality only at $\xi = 0$, we want to make all of the measurements at the origin (i.e., $\xi_i = 0$, $i = 1, 2, \ldots, k$). Hence, $\mathrm{tr}\,(A'A)^{-1} = 1/k$. Then for the minimum error expression we have

$$e_{\min} = \sigma_{\bar n}^2\,\mathrm{tr}\,(A'A)^{-1} = \sigma_{\bar n}^2/k. \qquad (99)$$

The error is inversely proportional to the number of measurements.
As $k$ approaches infinity, $e_{\min}$ approaches zero. The above sampling scheme applies to all well-behaved point spread functions for the case of white noise.
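For the single point source the error is $\sigma_{\bar n}^2/\sum_i \mathrm{sinc}^4\,\xi_i$, so any sample moved away from the peak can only increase it. A minimal sketch with illustrative sample positions (the function names are not from the original):

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def mse_single_source(positions, noise_var=1.0):
    """e = noise_var / sum_i sinc^4(xi_i) for one source at the origin."""
    return noise_var / sum(sinc(xi) ** 4 for xi in positions)

k = 4
all_at_peak = mse_single_source([0.0] * k)                 # equals noise_var / k
spread_out = mse_single_source([0.0, 0.2, -0.2, 0.5])      # some samples off the peak
print(all_at_peak, spread_out)
```

Placing every sample at the origin gives exactly `noise_var / k`, and the spread-out arrangement gives a larger error.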

m point sources and $\ell$ measurements. Assume that our object consists of $m$ point sources and that they are separated by the Rayleigh criterion distance. By the Rayleigh criterion distance, we mean that the maximum of the diffraction pattern of one point source overlaps the first minima of the diffraction patterns due to adjacent point sources. We want to minimize $e = \sigma_{\bar n}^2\,\mathrm{tr}\,(A'A)^{-1}$ when we have $m$ point sources and make $\ell$ measurements.

Slit aperture. For a slit aperture the general expression for the point spread function is

$$\frac{D^2}{v^2R^2}\,\mathrm{sinc}^2\!\left[\frac{D}{vR}(\xi-j)\right]. \qquad (100)$$

To simplify the mathematics we assume that the point spread function is normalized with $D/vR$ set equal to unity. The point spread function is then $\mathrm{sinc}^2(\xi-j)$ where $j = \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$. The normalized system matrix for the slit aperture case becomes

$$A = \begin{bmatrix} \mathrm{sinc}^2(\xi_1-1) & \cdots & \mathrm{sinc}^2(\xi_1-m) \\ \vdots & & \vdots \\ \mathrm{sinc}^2(\xi_\ell-1) & \cdots & \mathrm{sinc}^2(\xi_\ell-m) \end{bmatrix}$$


( \  0  r 


Figure 7. Optical system configuration for an infinite slit aperture and point sources separated by the Rayleigh criterion distance.
The optimum sampling procedure for this case (see Appendix) is to sample $\ell/m$ times at the peak of each point spread function. If $\ell/m$ is not an integer (i.e., $\ell/m = k + n/m$) the optimum procedure is to make $k$ measurements at the peak of each point spread function and one additional measurement at the peak of any $n$ of the $m$ point spread functions. If $\ell/m = k$ (integer)

$$e = \sigma_{\bar n}^2\,\mathrm{tr}\,(A'A)^{-1} = \sigma_{\bar n}^2\,m/k \qquad (101)$$

where $m$ is the number of point sources and $k$ is the number of measurements per point source.
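When the sources sit at the integers and every sample is taken at a peak, each row of the normalized $A$ has a single unit entry, so $A'A = kI$ and the error is $\sigma_{\bar n}^2 m/k$. The sketch below checks this and compares it with one shifted sampling pattern. It only illustrates the claim for a hypothetical example; it is not the optimality proof given in the Appendix.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def trace_of_inverse(M):
    """Trace of M^{-1} by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return sum(aug[i][n + i] for i in range(n))

def mse(positions, m, noise_var=1.0):
    """e = noise_var * tr (A'A)^{-1} with A[i][j] = sinc^2(xi_i - j), j = 1..m."""
    A = [[sinc(xi - j) ** 2 for j in range(1, m + 1)] for xi in positions]
    ata = [[sum(row[i] * row[j] for row in A) for j in range(m)] for i in range(m)]
    return noise_var * trace_of_inverse(ata)

m, k = 3, 2
at_peaks = mse([1, 1, 2, 2, 3, 3], m)                 # k samples at each peak
shifted = mse([1.2, 1.2, 2.2, 2.2, 3.2, 3.2], m)      # same count, off the peaks
print(at_peaks, m / k, shifted)
```

Sampling at the peaks reproduces $m/k$, while the shifted pattern gives a larger error because the diagonal of $A'A$ shrinks and off-diagonal coupling appears.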

Rectangular aperture. For the rectangular aperture the general expression for the point spread function is

$$\frac{a^2b^2}{v^2R^2}\,\mathrm{sinc}^2\!\left[\frac{a}{vR}(\xi-j)\right]\mathrm{sinc}^2\!\left[\frac{b}{vR}(\zeta-i)\right]$$


(1C 


Normalizing this point spread function by letting $a/vR = b/vR = 1$ yields $\mathrm{sinc}^2(\xi-j)\,\mathrm{sinc}^2(\zeta-i)$ where $i, j = \ldots, -2, -1, 0, 1, 2, 3, \ldots$.

The optimum sampling procedure and MSE for this case are the same as for the slit aperture case.

For point source spacings less than the Rayleigh criterion distance the optimum sampling procedure becomes very complicated and will not be considered here. Harris and Rushforth (1966) work out some specific examples of this case.
Colored noise

Because of the complexity of the colored noise case only a single point source and two measurements in the image plane will be considered. We will assume a noise covariance matrix of the form

$$K_{\bar n} = \sigma_{\bar n}^2\begin{bmatrix} 1 & e^{-c|d|} \\ e^{-c|d|} & 1 \end{bmatrix} \qquad (104)$$

where $|d|$ is the distance between the measurement positions (i.e., $|d| = |\xi_1 - \xi_2|$) and $c$ is a correlation constant. The inverse of the $K_{\bar n}$ matrix is

$$K_{\bar n}^{-1} = \frac{1}{\sigma_{\bar n}^2\,(1-e^{-2c|d|})}\begin{bmatrix} 1 & -e^{-c|d|} \\ -e^{-c|d|} & 1 \end{bmatrix}. \qquad (105)$$


For the present case of two measurements and one point source the system matrix $A$ is a two-element vector. If we assume that the point source is located at the origin of the object plane, the normalized system matrix $A$ becomes

$$A = \begin{bmatrix} \mathrm{sinc}^2\,\xi_1 \\ \mathrm{sinc}^2\,\xi_2 \end{bmatrix} \qquad (106)$$

where $\xi_1$ and $\xi_2$ are the measurement positions in the image plane.

As $c$ approaches infinity, $K_{\bar n}$ approaches $\sigma_{\bar n}^2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \sigma_{\bar n}^2 I$, which implies that the additive noise is uncorrelated at any two positions or that we have white noise. As $c$ approaches zero, $K_{\bar n}$ approaches $\sigma_{\bar n}^2\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$.
Case 2. Let $\xi_1 = -\xi_2$, which implies that

$$e = \frac{1 + e^{-2c|\xi_1|}}{2\,\mathrm{sinc}^4\,\xi_1}. \qquad (108)$$

Case 3. Let $\xi_2 = 0$ and vary $\xi_1$, which implies that

$$e = \frac{1 - e^{-2c|\xi_1|}}{1 + \mathrm{sinc}^4\,\xi_1 - 2\,\mathrm{sinc}^2\,\xi_1\,e^{-c|\xi_1|}}. \qquad (109)$$

Case 4. Let $\xi_2$ approach infinity and vary $\xi_1$. This implies that $e$ has a minimum of unity for $\xi_1 = 0$.

Cases 2 and 3 need to be investigated and compared with cases 1 and 4. Cases 2 and 3 were programmed and the error determined for varying $\xi_1$ and also for different values of $c$. Cases 2 and 3 were found to be always as good as or better than cases 1 and 4. The value of $c$ determines which of cases 2 and 3 gives the minimum error. For $c$ less than about 1.5, case 2 gives the minimum error; and for $c$ larger than 1.5, case 3 gives the minimum error. Figure 9 shows the minimum error possible versus $c$. Figures 10 and 11 show the error obtained for varying $|d|$ for cases 2 and 3.

In Figure 10, for case 3, note that as the value of $c$ becomes small the error becomes small and in fact approaches zero as $c$ approaches zero. When $c$ is zero the noise is constant from one position to another. Hence, if the noise plus signal is measured at the peak of the point spread function and the noise is measured at the null positions, the noise can be subtracted from the measurement at the peak and only the true signal will remain since the noise for both positions is the same. Note, however, that in this latter case the multiplicative noise must be considered.

Figure 11. MSE $e$ versus distance $|d|$ between the two measurement positions of case 2.
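The case 2 and case 3 expressions follow from $e = (A'K_{\bar n}^{-1}A)^{-1}$ with the exponential covariance (104) and unit noise variance. The sketch below, with hypothetical parameter values, evaluates the general two-sample error and compares it with the two special forms, assuming case 2 is $(1 + e^{-2c|\xi|})/(2\,\mathrm{sinc}^4\xi)$ and case 3 is $(1 - e^{-2c|\xi|})/(1 + \mathrm{sinc}^4\xi - 2\,\mathrm{sinc}^2\xi\,e^{-c|\xi|})$.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def mse_two_samples(xi1, xi2, c):
    """e = 1 / (A' K^{-1} A) for K = [[1, rho], [rho, 1]], rho = exp(-c |xi1 - xi2|)."""
    rho = math.exp(-c * abs(xi1 - xi2))
    a1, a2 = sinc(xi1) ** 2, sinc(xi2) ** 2
    return (1.0 - rho ** 2) / (a1 * a1 + a2 * a2 - 2.0 * rho * a1 * a2)

def case2(xi, c):
    """Symmetric samples at +xi and -xi."""
    return (1.0 + math.exp(-2.0 * c * abs(xi))) / (2.0 * sinc(xi) ** 4)

def case3(xi, c):
    """One sample at the peak (origin), the other at xi."""
    rho = math.exp(-c * abs(xi))
    return (1.0 - rho ** 2) / (1.0 + sinc(xi) ** 4 - 2.0 * sinc(xi) ** 2 * rho)

for c in (0.5, 1.5, 3.0):
    for xi in (0.2, 0.5, 0.8):
        print(c, xi, mse_two_samples(xi, -xi, c), case2(xi, c),
              mse_two_samples(xi, 0.0, c), case3(xi, c))

# Strongly correlated noise (small c) with the second sample at a null of the
# point spread function: the noise can be subtracted almost exactly.
print(case3(1.0, 0.01))
```

The last line shows the behavior described for Figure 10: with one sample at the peak and one at a null, the error tends to zero as $c$ tends to zero.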
OTHER ESTIMATES

Introduction

In this section we will investigate three other types of estimates: the Bayes' estimate, the maximum likelihood estimate, and the maximum a posteriori estimate. Throughout this section and the remaining sections we will consider only the photon-electron converter detector with a quantum efficiency of $\eta$. The incoming photons due to both known signals and known noise will be assumed to be Poisson distributed. Unless otherwise stated, whenever an a priori density function is needed for the mean rate of incident photons we will assume it to be a gamma distribution with known parameters (see Statistical Model section). Our motivation for using this distribution is its unique characteristic of generating another gamma distribution as an a posteriori density function when combined with a conditional Poisson distribution in the Bayes' formula. The gamma distribution is also physically reasonable (Goodman, 1965; Farrell, 1966).

We have defined $\bar y_i$ to be the mean number of photons that are incident upon the $i$th cell of the image plane per unit time and $\eta\bar y_i$ as the mean number of photoelectrons that are emitted from the light-sensitive $i$th cell per unit time.

Since the stream of incoming photons is Poisson distributed for known signal and noise, the probability that $y_i$ photons strike the $i$th cell in time $\tau$, given $\bar y_i$, is

$$p(y_i/\bar y_i) = \frac{(\bar y_i\tau)^{y_i}\,e^{-\bar y_i\tau}}{y_i!}. \qquad (110)$$

Likewise, the probability that $z_i$ photoelectrons are emitted from the $i$th cell in time $\tau$, given $\bar y_i$, is

$$p(z_i/\bar y_i) = \frac{(\eta\tau\bar y_i)^{z_i}\,e^{-\eta\tau\bar y_i}}{z_i!}. \qquad (111)$$


If we assume that the photoelectrons or "counts" are independent for each region (i.e., the number of electrons emitted in each region is independent of those emitted from other regions or cells) we can write

$$p(z/\bar y) = p(z_1, z_2, \ldots, z_m/\bar y) = \prod_{i=1}^{m} p(z_i/\bar y_i) = \prod_{i=1}^{m}\frac{(\eta\tau\bar y_i)^{z_i}\,e^{-\eta\tau\bar y_i}}{z_i!}. \qquad (112)$$

This is the probability that $z_1$ electrons are emitted from cell 1, $z_2$ electrons are emitted from cell 2, ..., and $z_m$ electrons are emitted from cell $m$, all in time $\tau$. We will also assume that the mean rate of noise photons is fixed and known when the noise is considered independently of the signal.

The estimates obtained in this section will be used for the estimator-correlator detector in the section on fixed-sample detection.
Bayes' Estimate

By definition, the Bayes' estimate is the $\hat y(z)$ which minimizes the average risk. When an incorrect decision or estimate is made a loss or a cost results. If $\bar y$ is the true state of nature and we say that $\hat y$ is the state of nature, we lose an amount $c(\bar y, \hat y)$. The information about the experiment is contained in the conditional density function $p(z/\bar y)$, which is assumed known for each $\bar y$. With the a priori density function $f(\bar y)$, the loss function $c(\bar y, \hat y)$, and the conditional density function $p(z/\bar y)$ for each $\bar y$, the estimate $\hat y$ can be found which minimizes the average loss.

The mathematical form of the Bayes' estimate is obtained as follows. If $\bar y$ is the true state of nature and we observe $z$, then we lose an amount $c[\bar y, \hat y(z)]$ by using the estimate $\hat y$. When $\bar y$ is the true state of nature the risk is the average of this loss function over all possible outcomes of the experiment. That is,

$$\rho(\bar y, \hat y) = \int_{-\infty}^{\infty}\!\!\cdots\!\int c[\bar y, \hat y(z)]\,p(z/\bar y)\,dz. \qquad (113)$$

The risk depends on both the state of nature $\bar y$ and on the estimate $\hat y$.

The average risk is the average of $\rho(\bar y, \hat y)$ over all possible states of nature. That is,

$$\rho(\hat y) = \int\!\!\cdots\!\int \rho(\bar y, \hat y)\,f(\bar y)\,d\bar y = \int\!\!\cdots\!\int c[\bar y, \hat y(z)]\,p(z/\bar y)\,f(\bar y)\,dz\,d\bar y. \qquad (114)$$

Using Bayes' Theorem we obtain

$$\rho(\hat y) = \int\!\!\cdots\!\int c[\bar y, \hat y(z)]\,f(\bar y/z)\,p(z)\,d\bar y\,dz = \int\!\!\cdots\!\int p(z)\,\rho_z(\hat y)\,dz \qquad (115)$$

where $\rho_z(\hat y) = \int\!\cdots\!\int c[\bar y, \hat y(z)]\,f(\bar y/z)\,d\bar y$ is the conditional risk. Since $p(z)$ is non-negative and independent of $\hat y$, we need only minimize $\rho_z(\hat y)$ for each $z$ in order to minimize $\rho(\hat y)$.

It is now necessary to specify a loss function $c(\bar y, \hat y)$. We will consider the quadratic loss function $K(\bar y - \hat y)^2$ where $K$ is a positive constant. Using this loss function our problem reduces to minimizing

$$\rho_z(\hat y) = K\int_{-\infty}^{\infty}\!\!\cdots\!\int (\bar y - \hat y)^2 f(\bar y/z)\,d\bar y \qquad (116)$$

for each $z$ by choosing the appropriate estimate $\hat y$. This minimization is accomplished by differentiating $\rho_z(\hat y)$ with respect to $\hat y$ and equating the result to zero. That is,

$$\frac{\partial}{\partial\hat y}\,\rho_z(\hat y) = -2K\int\!\!\cdots\!\int (\bar y - \hat y)\,f(\bar y/z)\,d\bar y = 0. \qquad (117)$$

The solution of this equation for $\hat y$ yields the Bayes' estimate

$$\hat y = \int\!\!\cdots\!\int \bar y\,f(\bar y/z)\,d\bar y. \qquad (118)$$

Hence, for a quadratic loss function the Bayes' estimate $\hat y$ is the mean of the a posteriori distribution $f(\bar y/z)$.

We will now find the Bayes' estimate of $\bar y_i$, which is the average number of signal-plus-noise photons which are incident upon the $i$th cell of the image plane per unit time. Assume that the a priori density function of the mean rate of signal-plus-noise photons $\bar y$ is

$$f(\bar y) = \prod_{i=1}^{m} f(\bar y_i) = \prod_{i=1}^{m}\frac{\alpha_i(\alpha_i\bar y_i)^{u_i-1}\,e^{-\alpha_i\bar y_i}}{\Gamma(u_i)}, \quad \bar y_i \ge 0; \qquad f(\bar y) = 0 \ \text{otherwise} \qquad (119)$$

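where $m$ is the number of cells in the image plane. The conditional Poisson distribution of the output photoelectrons $z$ is

$$p(z/\bar y) = \prod_{i=1}^{m} p(z_i/\bar y_i) = \prod_{i=1}^{m}\frac{e^{-\eta\tau\bar y_i}\,(\eta\tau\bar y_i)^{z_i}}{z_i!} \qquad (120)$$

where $z_i$ is the number of photoelectrons or "counts" emitted from the $i$th cell during time $\tau$. The a posteriori density function of $\bar y$, $f(\bar y/z)$, is given by Bayes' formula

$$f(\bar y/z) = \frac{p(z/\bar y)\,f(\bar y)}{p(z)} \qquad (121)$$

where

$$p(z) = \int_{-\infty}^{\infty}\!\!\cdots\!\int p(z/\bar y)\,f(\bar y)\,d\bar y = \prod_{i=1}^{m}\frac{\alpha_i^{u_i}\,(\eta\tau)^{z_i}\,\Gamma(z_i+u_i)}{z_i!\;\Gamma(u_i)\,(\alpha_i+\eta\tau)^{z_i+u_i}}. \qquad (122)$$

The a posteriori density function becomes

$$f(\bar y/z) = \prod_{i=1}^{m}\frac{(\alpha_i+\eta\tau)\,[\bar y_i(\alpha_i+\eta\tau)]^{z_i+u_i-1}\,e^{-(\alpha_i+\eta\tau)\bar y_i}}{\Gamma(z_i+u_i)}. \qquad (123)$$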
The Bayes' estimate (conditional mean of $\bar y_i$) is then

$$\hat y_i = \frac{z_i + u_i}{\alpha_i + \eta\tau}. \qquad (124)$$

For this case the conditional variance of $\bar y_i$ is

$$\mathrm{var}(\bar y_i/z) = \frac{z_i + u_i}{(\alpha_i + \eta\tau)^2}. \qquad (125)$$

As $\tau$ becomes large, $E(\bar y_i/z)$ approaches $z_i/\eta\tau$ and $\mathrm{var}(\bar y_i/z)$ approaches zero. Hence, as $\tau$ becomes large the estimate $\hat y_i$ approaches the true value of $\bar y_i$ since the variance approaches zero. For multiple sampling the Bayes' estimate becomes

$$\hat y_i^{(k)} = \frac{\sum_{j=1}^{k} z_i^{(j)} + u_i}{\alpha_i + k\eta\tau} \qquad (126)$$

where the superscript $k$ represents the number of samples.
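The conjugacy of the gamma prior with the Poisson likelihood can be checked numerically. The sketch below, with hypothetical parameter values, integrates the unnormalized posterior $\bar y^{\,z+u-1}e^{-(\alpha+\eta\tau)\bar y}$ on a grid and compares its mean and variance with $(z+u)/(\alpha+\eta\tau)$ and $(z+u)/(\alpha+\eta\tau)^2$.

```python
import math

def posterior_moments(z, u, alpha, eta_tau, steps=200000, y_max=60.0):
    """Mean and variance of the posterior, by midpoint-rule integration.

    The unnormalized posterior is prior * likelihood:
    y^(u-1) e^(-alpha y)  *  y^z e^(-eta_tau y).
    """
    h = y_max / steps
    m0 = m1 = m2 = 0.0
    for k in range(1, steps + 1):
        y = (k - 0.5) * h
        w = math.exp((z + u - 1) * math.log(y) - (alpha + eta_tau) * y)
        m0 += w
        m1 += w * y
        m2 += w * y * y
    mean = m1 / m0
    return mean, m2 / m0 - mean * mean

z, u, alpha, eta_tau = 5, 3.0, 2.0, 1.5       # hypothetical values
mean_num, var_num = posterior_moments(z, u, alpha, eta_tau)
mean_closed = (z + u) / (alpha + eta_tau)
var_closed = (z + u) / (alpha + eta_tau) ** 2
print(mean_num, mean_closed)
print(var_num, var_closed)
```

The grid spacing and upper limit are chosen so that both the truncation and discretization errors are negligible for these values; other parameter choices may need a different grid.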

Now assume that the a priori density function of the mean rate of signal photons $s$ ($s = A\bar x$) is

$$f(s) = \prod_{i=1}^{m} f(s_i) = \prod_{i=1}^{m}\frac{\beta_i(\beta_is_i)^{u_i-1}\,e^{-\beta_is_i}}{\Gamma(u_i)}, \quad s_i \ge 0; \qquad f(s) = 0 \ \text{otherwise}. \qquad (127)$$

The mean rate of noise photons is now assumed to be fixed and known.

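The distribution of the output photoelectrons $z_i$ is

$$p(z_i) = \int_0^{\infty} p(z_i/s_i)\,f(s_i)\,ds_i = \frac{e^{-\bar n_i\eta\tau}\,(\eta\tau)^{z_i}\,\beta_i^{u_i}}{z_i!\;\Gamma(u_i)}\sum_{\ell=0}^{z_i}\binom{z_i}{\ell}\frac{\bar n_i^{\,\ell}\;\Gamma(z_i+u_i-\ell)}{(\eta\tau+\beta_i)^{z_i+u_i-\ell}} \qquad (128)$$

where

$$\binom{z_i}{\ell} = \frac{z_i!}{(z_i-\ell)!\;\ell!}. \qquad (129)$$

Using the Bayes' formula and (120), (127), and (128) yields

$$f(s_i/z) = \frac{(\beta_i+\eta\tau)^{z_i+u_i}\,(s_i+\bar n_i)^{z_i}\,s_i^{u_i-1}\,e^{-(\beta_i+\eta\tau)s_i}}{\sum_{\ell=0}^{z_i}\binom{z_i}{\ell}\,[\bar n_i(\beta_i+\eta\tau)]^{\ell}\;\Gamma(z_i+u_i-\ell)}. \qquad (130)$$

The Bayes' estimate of $s_i$ is then

$$\hat s_i = E(s_i/z) = \frac{1}{\beta_i+\eta\tau}\;\frac{\sum_{\ell=0}^{z_i}\binom{z_i}{\ell}\,[\bar n_i(\beta_i+\eta\tau)]^{\ell}\;\Gamma(z_i+u_i+1-\ell)}{\sum_{\ell=0}^{z_i}\binom{z_i}{\ell}\,[\bar n_i(\beta_i+\eta\tau)]^{\ell}\;\Gamma(z_i+u_i-\ell)}. \qquad (131)$$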


For $\bar n_i = 0$ this estimate becomes $\hat s_i = (z_i + u_i)/(\beta_i + \eta\tau)$, which is the same as (124).
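The ratio-of-sums form of the Bayes' estimate with known noise can be checked against direct numerical integration of the posterior. The sketch below assumes the posterior is proportional to $(s+\bar n)^{z}s^{u-1}e^{-(\beta+\eta\tau)s}$, which follows from a gamma prior on $s$ and a Poisson count with mean $\eta\tau(s+\bar n)$; the parameter values are hypothetical.

```python
import math

def bayes_signal(z, u, beta, eta_tau, n):
    """Posterior mean of s written as a ratio of binomial-gamma sums."""
    B = beta + eta_tau

    def series(shift):
        return sum(math.comb(z, l) * (n * B) ** l * math.gamma(z + u + shift - l)
                   for l in range(z + 1))

    return series(1) / (B * series(0))

def bayes_signal_numeric(z, u, beta, eta_tau, n, steps=200000, s_max=40.0):
    """Posterior mean of s by midpoint-rule integration of the posterior."""
    B = beta + eta_tau
    h = s_max / steps
    m0 = m1 = 0.0
    for k in range(1, steps + 1):
        s = (k - 0.5) * h
        w = (s + n) ** z * s ** (u - 1) * math.exp(-B * s)
        m0 += w
        m1 += w * s
    return m1 / m0

z, u, beta, eta_tau, n = 4, 2.5, 1.0, 2.0, 0.7     # hypothetical values
print(bayes_signal(z, u, beta, eta_tau, n), bayes_signal_numeric(z, u, beta, eta_tau, n))
print(bayes_signal(z, u, beta, eta_tau, 0.0), (z + u) / (beta + eta_tau))  # no-noise limit
```

With the noise rate set to zero only the first term of each sum survives and the estimate reduces to $(z+u)/(\beta+\eta\tau)$.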

The estimate of the object $\bar x$ using the estimate of the object's image $\hat s$ is

$$\hat x = (A'A)^{-1}A'\hat s \qquad (132)$$

where $\hat s$ is the vector of the estimates $\hat s_i$ and $(A'A)^{-1}A'$ is the pseudoinverse of the matrix $A$ (Deutsch, 1965).

The direct derivation of the Bayes' estimate of the mean rate of photons emitted from the object is very difficult and will not be considered.


Maximum  A  Posteriori  Estimate 


When no costs are specified in an estimation problem, a reasonable estimation procedure is to maximize the a posteriori density function $f(\bar y/z) = f(\bar y)\,p(z/\bar y)/p(z)$. The maximum a posteriori estimate $\hat y$ is defined as the value of $\bar y$ that maximizes $f(\bar y/z)$.

The maximum a posteriori estimate of $\bar y$ is found by solving the $m$ simultaneous equations

$$\frac{\partial}{\partial\bar y_i}\,f(\bar y/z) = 0 \quad (i = 1, 2, \ldots, m). \qquad (133)$$

Since the logarithm is a monotonic function, $\ln f(\bar y/z)$ has its maximum for the same values of $\bar y$ that maximize $f(\bar y/z)$. Hence, we can solve the equivalent $m$ simultaneous equations

$$\frac{\partial}{\partial\bar y_i}\,\ln f(\bar y/z) = 0 \quad (i = 1, 2, \ldots, m). \qquad (134)$$

There may be several roots of these equations, in which case the solution $\hat y$ that yields the highest peak of the function $f(\bar y/z)$ must be chosen. Since the denominator of $f(\bar y)\,p(z/\bar y)/p(z)$ does not depend on $\bar y$, maximizing $f(\bar y/z)$ is equivalent to maximizing $f(\bar y)\,p(z/\bar y)$.

We will now find the maximum a posteriori estimate of $\bar y_i$, which is the average number of photons that are incident upon the $i$th cell or region of the image plane per unit time. Assume that the a priori density function of the mean rate of signal-plus-noise photons $\bar y$ is

$$f(\bar y) = \prod_{i=1}^{m} f(\bar y_i) = \prod_{i=1}^{m}\frac{\alpha_i(\alpha_i\bar y_i)^{u_i-1}\,e^{-\alpha_i\bar y_i}}{\Gamma(u_i)}, \quad \bar y_i \ge 0; \qquad f(\bar y) = 0 \ \text{otherwise}. \qquad (135)$$

The conditional Poisson distribution of the output photoelectrons $z$ is

$$p(z/\bar y) = \prod_{i=1}^{m} p(z_i/\bar y_i) = \prod_{i=1}^{m}\frac{e^{-\eta\tau\bar y_i}\,(\eta\tau\bar y_i)^{z_i}}{z_i!}. \qquad (136)$$


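Now maximize $p(z/\bar y)\,f(\bar y)$ with respect to $\bar y$. That is,

$$\frac{\partial}{\partial\bar y_j}\,\ln p(z/\bar y)f(\bar y) = \frac{\partial}{\partial\bar y_j}\sum_{i=1}^{m}\,[\ln f(\bar y_i) + \ln p(z_i/\bar y_i)] = \frac{u_j-1}{\bar y_j} - \alpha_j + \frac{z_j}{\bar y_j} - \eta\tau = 0. \qquad (137)$$

Hence, the maximum a posteriori estimate of $\bar y_i$ is

$$\hat y_i = \frac{z_i + u_i - 1}{\alpha_i + \eta\tau}. \qquad (138)$$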


Now assume that the a priori density function of the mean rate of signal photons is

$$f(s) = \prod_{i=1}^{m}\frac{\beta_i(\beta_is_i)^{u_i-1}\,e^{-\beta_is_i}}{\Gamma(u_i)}, \quad s_i \ge 0; \qquad f(s) = 0 \ \text{otherwise}. \qquad (139)$$

Assume that the noise is fixed and known. Maximize $p(z/s)\,f(s)$, where $p(z/s)$ is given in (136), with respect to $s$. That is,

$$\frac{\partial}{\partial s_j}\,\ln p(z/s)f(s) = \frac{\partial}{\partial s_j}\sum_{i=1}^{m}\,[\ln f(s_i) + \ln p(z_i/s_i)] = \frac{u_j-1}{s_j} - \beta_j - \eta\tau + \frac{z_j}{s_j+\bar n_j} = 0. \qquad (140)$$


Hence, the maximum a posteriori estimate of $s_i$ is

$$\hat s_i = \frac{z_i + u_i - 1 - \bar n_i(\eta\tau+\beta_i) + \sqrt{[\bar n_i(\eta\tau+\beta_i) - z_i + u_i - 1]^2 + 4z_i(u_i-1)}}{2(\eta\tau+\beta_i)}$$


<\4." 


This solution becomes the same as (138), as it should, when the mean rate of noise photons $\bar n_i$ is equal to zero. Again the estimate of the object $\bar x$ using the estimate of the object's image $\hat s$ is that of (132).
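The closed-form root can be checked by substituting it back into the stationarity condition $(u-1)/s - \beta - \eta\tau + z/(s+\bar n) = 0$. The sketch below uses hypothetical parameter values and takes the positive root of the resulting quadratic.

```python
import math

def map_signal(z, u, beta, eta_tau, n):
    """Positive root of (u-1)/s - beta - eta_tau + z/(s+n) = 0."""
    B = eta_tau + beta
    disc = (n * B - z + u - 1) ** 2 + 4 * z * (u - 1)
    return (z + u - 1 - n * B + math.sqrt(disc)) / (2 * B)

def score(s, z, u, beta, eta_tau, n):
    """Derivative of ln[p(z/s) f(s)] with respect to s."""
    return (u - 1) / s - beta - eta_tau + z / (s + n)

z, u, beta, eta_tau, n = 6, 3.0, 1.0, 2.0, 0.5     # hypothetical values
s_hat = map_signal(z, u, beta, eta_tau, n)
print(s_hat, score(s_hat, z, u, beta, eta_tau, n))   # score vanishes at the root

# With no noise the estimate reduces to (z + u - 1) / (beta + eta_tau).
print(map_signal(z, u, beta, eta_tau, 0.0), (z + u - 1) / (beta + eta_tau))
```

For $u > 1$ the discriminant is positive and the chosen root is positive, so it lies in the region where the prior density is nonzero.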

Consider finding the estimate of $\bar x$ directly when the a priori density function of $s$ is that of (139). To find the maximum a posteriori estimate of $\bar x$ we must find the value of $\bar x$ that maximizes $p(z/s)\,f(s)$. This value of $\bar x$ is found by taking a derivative of $p(z/s)\,f(s)$ with respect to $\bar x$ and setting the result equal to zero. When this is done we obtain

$$\hat x = (A'A)^{-1}A'\hat s. \qquad (146)$$

This particular solution gives the same result that was obtained by first finding the estimate of $s$ and then using it to find the estimate of $\bar x$. We must keep in mind that for the present case, (144) has other solutions besides (145) and that there may exist several relative maxima of $\ln p(z/s)f(s)$, from which we can have only one maximal value. It is quite possible that the solution of (145) does not give rise to the absolute maximum of $p(z/s)\,f(s)$, in which case (145) would not be the maximum a posteriori estimate.


Maximum  Likelihood  Estimate 


The maximum a posteriori estimate depends on the a priori probability density function, but for some situations no such a priori information may be available. Under these circumstances we need to consider the maximum likelihood estimate. Also, if the a priori density function is broad and flat and relatively independent of $y$ over the region where $p(z/y)$ is significant (i.e., initial knowledge of $y$ is very small), then maximizing $p(z/y)$ is nearly equivalent to maximizing $f(y/z)$. The value of $y$ which maximizes $p(z/y)$ is defined as the maximum likelihood estimate of $y$. Since $\ln p(z/y)$ is a monotonic function of $p(z/y)$, we can solve either of the two equations $\partial p(z/y)/\partial y_j = 0$ or $\partial \ln p(z/y)/\partial y_j = 0$ for the estimate of $y$. We have

$$\frac{\partial}{\partial y_j}\ln p(z/y) = \frac{\partial}{\partial y_j}\sum_{i=1}^{m}\left[z_i\ln(\eta\tau y_i) - \eta\tau y_i - \ln(z_i!)\right] = \frac{z_j}{y_j} - \eta\tau = 0. \tag{147}$$

Hence, the maximum likelihood estimate of $y_j$ is

$$\hat{y}_j = \frac{z_j}{\eta\tau}. \tag{148}$$


The maximum likelihood estimate of $s$ is found in the same manner as above. That is,

$$\frac{\partial}{\partial s_j}\ln p(z/s) = \frac{z_j}{s_j + n_j} - \eta\tau = 0. \tag{149}$$

Hence, the maximum likelihood estimate of $s_j$ is

$$\hat{s}_j = \frac{z_j}{\eta\tau} - n_j. \tag{150}$$

The estimate of the object $x$ using the estimate of the object's image $s$ is the same as in (132).

Now consider finding the maximum likelihood estimate of the object $x$. To find this estimate we need to find the value of $x$ that maximizes $p(z/x)$ or, alternatively, $\ln p(z/x)$. Taking the derivative of $\ln p(z/x)$ with respect to $x_j$ and setting the result equal to zero yields

$$\frac{\partial}{\partial x_j}\ln p(z/x) = \sum_{i=1}^{m}\left[\frac{z_i}{a_i x + n_i} - \eta\tau\right]a_{ij} = \sum_{i=1}^{m} Y_i a_{ij} = 0, \tag{151}$$

where $Y_i = z_i/(a_i x + n_i) - \eta\tau$ and $a_i$ in this expression denotes the $i$th row of $A$. In matrix notation this is written as

$$Y'a_j = 0 \quad (j = 1, \ldots, m), \tag{152}$$

where $a_j$ is the $j$th column vector of the system matrix $A$. Equation (151) holds for $j = 1, 2, \ldots, m$; hence, it constitutes $m$ equations in the $m$ unknowns $x_1, \ldots, x_m$ which we want to estimate. Using matrix notation this set of $m$ equations can be written as

$$A'Y = 0, \tag{153}$$

which means that $Y$ must be contained within the null space of $A'$. The discussion of the solutions of (144) for the maximum a posteriori estimate of $x$ also applies to (152). We will consider the solution $Y = 0$, which implies that

$$a_i x = \frac{z_i}{\eta\tau} - n_i. \tag{154}$$

For general $A$ the maximum likelihood estimate of $x$ is

$$\hat{x} = (A'A)^{-1}A'\left(\frac{z}{\eta\tau} - n\right) = (A'A)^{-1}A'(\hat{y} - n) = (A'A)^{-1}A'\hat{s}. \tag{155}$$
This is the same estimate that is obtained when we estimate $s$ first and then use this estimate to estimate $x$. It should again be pointed out that (153) has other solutions besides (154), one or more of which may give rise to a maximum value of $p(z/x)$ which is larger than the value due to (154), in which case (154) would not be the maximum likelihood estimate.
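The two-step route (estimate $s$, then solve the normal equations for $x$) can be sketched in a few lines. This is an illustration only: the 2-by-2 system matrix, the noise rates, and the noise-free "counts" set equal to their means are assumed values, and `solve_linear` is a hypothetical helper written here to keep the example free of third-party libraries.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            f = aug[r][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x

def ml_estimate_x(A, z, n, eta_tau):
    """x_hat = (A'A)^{-1} A' (z / eta_tau - n)."""
    rows, cols = len(A), len(A[0])
    s_hat = [z[i] / eta_tau - n[i] for i in range(rows)]
    AtA = [[sum(A[i][p] * A[i][q] for i in range(rows)) for q in range(cols)]
           for p in range(cols)]
    Ats = [sum(A[i][p] * s_hat[i] for i in range(rows)) for p in range(cols)]
    return solve_linear(AtA, Ats)

# Assumed example: counts set to their means so the estimate recovers x exactly.
A = [[0.8, 0.2], [0.2, 0.8]]
x_true = [10.0, 4.0]
n = [1.0, 1.0]
eta_tau = 2.0
z = [eta_tau * (sum(A[i][k] * x_true[k] for k in range(2)) + n[i]) for i in range(2)]
x_hat = ml_estimate_x(A, z, n, eta_tau)
```

With real Poisson counts the recovered `x_hat` would scatter around `x_true`; the sketch only confirms the algebra.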


Discussion  of  Estimates 

The Bayes' estimate and the maximum a posteriori estimate have received some criticism. The basic argument against them is based upon the requirement of a priori probability density functions for the random variables to be observed during an experiment. The maximum likelihood estimate has objectionable small-sample-size properties (Deutsch, 1965).

It should be pointed out that all the estimates of $y_i$ of this section, given that $y_i$ has a gamma distribution, are linear estimates (linear with respect to the observable $z_i$), as is the minimum MSE estimate. However, the Bayes' and maximum a posteriori estimates of $s_i$ and $x_i$, given that $s_i$ has a gamma distribution, are not linear estimates. The Bayes' estimate of $y_i$ is the same as the minimum MSE estimate of $y_i$ given that $y_i$ has a gamma distribution. The maximum a posteriori estimate of $y_i$ differs from the minimum MSE estimate only by a minus one in the numerator. The maximum a posteriori estimate becomes the same as the Bayes' and minimum MSE estimates for large values of $u_i$.
FIXED-SAMPLE SIGNAL DETECTION

Introduction 

For the fixed-sample detection procedure, we will consider the Bayes' decision rule. By definition, Bayes' decision rule is the decision rule that minimizes the average loss.

We will consider only situations in which there are two possible states of nature, $\omega_1$ and $\omega_2$, and we will assume that the a priori probabilities $p(\omega_1)$ and $p(\omega_2)$ are known. Also, we will assume the probabilities $p(z/\omega_i)$ $(i = 1, 2)$ are known.

We observe the outcome of the experiment and decide which state of nature is present. If we choose $\omega_j$ as the state of nature when $\omega_i$ is the true state of nature, we lose an amount $c(\omega_i, \omega_j) = c_{ij}$. We want to find the decision rule that minimizes the average loss.

To determine the form of Bayes' decision rule, first calculate the average loss resulting when decision rule $d(\cdot)$ is used. If we observe $z$ when $\omega_i$ is the true state of nature we lose amount $c[\omega_i, d(z)]$. Also, when $\omega_i$ is true, $z$ occurs with probability $p(z/\omega_i)$. For the average loss or risk we can then write

$$\rho(\omega_i, d) = \sum_{z=0}^{\infty} c[\omega_i, d(z)]\,p(z/\omega_i). \tag{156}$$

Since $\omega_i$ occurs with probability $p(\omega_i)$, the average loss or risk resulting when the decision rule $d$ is used is

$$\rho(d) = \sum_{i=1}^{2}\rho(\omega_i, d)\,p(\omega_i). \tag{157}$$

By definition Bayes' decision rule $d^*$ is that rule which minimizes the average risk. That is, we require that $\rho(d^*) \le \rho(d)$ for all possible decision rules $d$.

The  derivation  of  this  rule  follows. 


We can minimize $\rho(d)$ by choosing the decision rule $d$ that minimizes the conditional risk $\rho_z(d)$ for each $z$. The conditional risk can be written as

$$\rho_z(d) = c[\omega_1, d(z)]\,p(\omega_1/z) + c[\omega_2, d(z)]\,p(\omega_2/z).$$

In accordance with the decision rule we must choose either $d(z) = \omega_1$ or $d(z) = \omega_2$. If we choose $d(z) = \omega_1$ then $\rho_z(d) = c_{11}p(\omega_1/z) + c_{21}p(\omega_2/z)$. Similarly, if we choose $d(z) = \omega_2$ then $\rho_z(d) = c_{12}p(\omega_1/z) + c_{22}p(\omega_2/z)$. Bayes' decision rule becomes: choose $\omega_1$ if $c_{11}p(\omega_1/z) + c_{21}p(\omega_2/z) < c_{12}p(\omega_1/z) + c_{22}p(\omega_2/z)$ or, by using Bayes' formula, we have:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \ell(z) > \delta,\\ &\text{choose } \omega_2 \text{ if } \ell(z) < \delta,\end{aligned} \tag{160}$$

where

$$\delta \triangleq \frac{p(\omega_2)(c_{21} - c_{22})}{p(\omega_1)(c_{12} - c_{11})} \tag{161}$$

and

$$\ell(z) = \ell(z_1, \ldots, z_m) = \frac{p(z_1, z_2, \ldots, z_m/\omega_1)}{p(z_1, z_2, \ldots, z_m/\omega_2)} \tag{162}$$

is the likelihood ratio of the observation and $p(z_1, \ldots, z_m/\omega_i)$ is the conditional probability of observing "counts" $z_1, \ldots, z_m$ when $\omega_i$ is the true state of nature. We have assumed that these "counts" are independent; hence, we have

$$\ell(z) = \prod_{i=1}^{m}\frac{p(z_i/\omega_1)}{p(z_i/\omega_2)}. \tag{163}$$

We will assume throughout the rest of this paper that $c_{11} = c_{22} = 0$, $c_{12} = c_{21}$, and that $p(\omega_1) = p(\omega_2) = 1/2$; hence, $\delta = 1$.

Since the natural logarithm is a monotonically increasing function of its argument, the logarithm can be taken of both sides of the inequalities in (160) to obtain an equivalent decision procedure. Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } L(z) = \ln\ell(z) > 0,\\ &\text{choose } \omega_2 \text{ if } L(z) < 0.\end{aligned} \tag{164}$$
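The threshold and the logarithmic form of the rule can be sketched as follows. The function names are illustrative, and assigning a tie ($L(z) = \ln\delta$) to $\omega_2$ is an assumption, since the rule above leaves that case unspecified.

```python
import math

def bayes_threshold(p1, p2, c11, c12, c21, c22):
    """Threshold delta; c_ij is the loss for choosing state j when state i is true."""
    return (p2 * (c21 - c22)) / (p1 * (c12 - c11))

def decide(log_lr, delta=1.0):
    """Return 1 (choose omega_1) if L(z) > ln(delta), otherwise 2."""
    return 1 if log_lr > math.log(delta) else 2

# Equal priors, zero loss for correct decisions, equal loss for errors.
delta = bayes_threshold(0.5, 0.5, 0.0, 1.0, 1.0, 0.0)
```

Under the equal-prior, symmetric-loss assumptions adopted in the text, `delta` evaluates to 1 and the test reduces to the sign of the log-likelihood ratio.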

Our problem is to observe the "counts" at the output of the detector and from this decide which of the two signals gave rise to this output. Figure 12 illustrates the problem.

We will be interested in assessing the error associated with Bayes' decision rule. When analyzing the error probability, we will consider throughout the remainder of this paper only one region or cell in the image plane where measurements will be made. This assumption is for mathematical convenience and does not introduce a serious loss of generality. Hence, for error analysis our problem is reduced to a case of a scalar signal. At the end of this section a comparison will be made of the error probabilities for the various cases that will be considered.

Two  Known  Signals 

Assume that at the input of the detector we have one of two possible signals with known mean rates $y_1$ and $y_2$, both of which contain any constant background noise that may be present. For convenience in discussion, we will refer to $y_1$ and $y_2$ as the signals. For the present case, we have two possible states of nature:

$$\omega_1:\ y = y_1 \ \text{(known)}, \qquad \omega_2:\ y = y_2 \ \text{(known)},$$

where $y_1 > y_2$. Knowing that signal $y_i$ is present is equivalent to knowing that $\omega_i$ is present. The objective of our discrimination procedure is to decide which of the two states of nature, $\omega_1$ or $\omega_2$, is the true state of nature.

The probability of $z_i$ photoelectrons being emitted in time $\tau$ from the $i$th cell of the image plane, given that $\omega_j$ is the true state of nature, is

$$p(z_i/\omega_j) = \frac{(\eta\tau y_{ji})^{z_i}e^{-\eta\tau y_{ji}}}{z_i!} \quad (j = 1, 2), \tag{165}$$

where $y_{ji}$ is the $i$th component of signal $y_j$ (i.e., $y_j = (y_{j1}, \ldots, y_{jm})$). Hence, the likelihood ratio is

$$\ell(z) = \frac{p(z/\omega_1)}{p(z/\omega_2)} = \prod_{i=1}^{m}\left(\frac{y_{1i}}{y_{2i}}\right)^{z_i}e^{-\eta\tau(y_{1i} - y_{2i})}, \tag{166}$$

where $m$ is the number of cells or measurement positions in the image plane. A more convenient form is

$$L(z) = \ln\ell(z) = \sum_{i=1}^{m}\left[z_i\ln(y_{1i}/y_{2i}) - \eta\tau(y_{1i} - y_{2i})\right]. \tag{167}$$

Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m} z_i\ln(y_{1i}/y_{2i}) > \eta\tau\sum_{i=1}^{m}(y_{1i} - y_{2i}),\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{168}$$

This detector has the form of a digital matched filter where the filter is matched to $\ln(y_{1i}/y_{2i})$. This detector is illustrated in Figure 13.
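A minimal sketch of the two-known-signals detector follows. It evaluates the log-likelihood ratio as a weighted sum of the counts and cross-checks it against the difference of Poisson log-probabilities; the three-cell signal values are assumed for illustration only.

```python
import math

def log_likelihood_ratio(z, y1, y2, eta_tau):
    """L(z) = sum_i [ z_i ln(y1_i / y2_i) - eta_tau (y1_i - y2_i) ]."""
    return sum(zi * math.log(a / b) - eta_tau * (a - b)
               for zi, a, b in zip(z, y1, y2))

def poisson_log_pmf(k, mean):
    """ln of the Poisson probability of k counts with the given mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

# Assumed example values.
z = [7, 3, 5]
y1 = [6.0, 4.0, 5.0]
y2 = [3.0, 2.0, 2.5]
eta_tau = 1.0

L = log_likelihood_ratio(z, y1, y2, eta_tau)
direct = sum(poisson_log_pmf(zi, eta_tau * a) - poisson_log_pmf(zi, eta_tau * b)
             for zi, a, b in zip(z, y1, y2))
choice = 1 if L > 0 else 2
```

The weights `ln(y1_i / y2_i)` are fixed once the two signals are known, which is why the detector has the matched-filter form described above.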

The above case may, as a special case, be considered as two known signals $s_1$ and $s_2$ ($s_2$ may or may not be zero, but $s_1 > s_2$) imbedded in known noise $n$. The two possible states of nature for this case are:

$$\omega_1:\ y = s_1 + n \ \text{(known signal plus noise)}, \qquad \omega_2:\ y = s_2 + n \ \text{(known signal plus noise)}.$$

Bayes' decision rule for this case is:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m} z_i\ln\left(\frac{s_{1i} + n_i}{s_{2i} + n_i}\right) > \eta\tau\sum_{i=1}^{m}(s_{1i} - s_{2i}),\\ &\text{choose } \omega_2 \text{ otherwise,}\end{aligned} \tag{169}$$

where $s_{1i}$ is the $i$th component of the vector signal $s_1$. For the case where $s_2 = 0$, the noise is uniform (i.e., $n_i = n$ for all $i$), and we have a small signal-to-noise ratio $s_{1i}/n$ for all $i$ (so that $\ln(1 + s_{1i}/n) \approx s_{1i}/n$), Bayes' decision rule reduces to:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m} z_i s_{1i} > \eta\tau\, n\sum_{i=1}^{m} s_{1i},\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{170}$$

This detector has the form of a digital matched filter where the filter is matched to the signal $s_1$. This detector is illustrated in Figure 14.

We will now determine the error probability associated with the general case of two possible known signals $y_1$ and $y_2$. As mentioned earlier, only a single cell of the image plane will be used in our error analysis, thus reducing the problem to one of a scalar signal. Bayes' decision rule now becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > \frac{\eta\tau(y_1 - y_2)}{\ln(y_1/y_2)} = \gamma,\\ &\text{choose } \omega_2 \text{ otherwise,}\end{aligned} \tag{171}$$

where $z$, $y_1$, and $y_2$ are all scalars.

We will now determine the error probability for this procedure. The general expression for the error probability is

$$P_e = p(\omega_1)P(FD) + p(\omega_2)P(FA), \tag{172}$$

where $P(FA)$ is the probability of saying $y_1$ is present when $y_2$ is present (probability of false alarm), and $P(FD)$ is the probability of saying $y_2$ is present when $y_1$ is present (probability of false dismissal). Assume that $p(\omega_2) = p(\omega_1) = 1/2$. Thus

$$P_e = \tfrac{1}{2}P[z > \gamma/\omega_2] + \tfrac{1}{2}P[z < \gamma/\omega_1] = \frac{1}{2}\sum_{z > \gamma}\frac{(\eta\tau y_2)^z e^{-\eta\tau y_2}}{z!} + \frac{1}{2}\sum_{z < \gamma}\frac{(\eta\tau y_1)^z e^{-\eta\tau y_1}}{z!}. \tag{173}$$


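For large values of $z$ the Central Limit Theorem applies and hence $z$ becomes approximately Gaussian. In order to completely specify the Gaussian densities $f(z/\omega_2)$ and $f(z/\omega_1)$ we need to find the conditional means, $E(z/\omega_1)$ and $E(z/\omega_2)$, and the conditional variances, $\mathrm{Var}(z/\omega_1)$ and $\mathrm{Var}(z/\omega_2)$. These are given as follows:

$$E(z/\omega_1) = \mu_1 = \eta\tau y_1, \tag{174}$$

$$E(z/\omega_2) = \mu_2 = \eta\tau y_2, \tag{175}$$

$$\mathrm{Var}(z/\omega_1) = \sigma_1^2 = \eta\tau y_1, \tag{176}$$

$$\mathrm{Var}(z/\omega_2) = \sigma_2^2 = \eta\tau y_2. \tag{177}$$

Hence, using (174)-(177) and assuming a Gaussian probability density function approximation we have

$$\begin{aligned}P_e &\approx \frac{1}{2}\int_{\gamma}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma_2}e^{-(x-\mu_2)^2/2\sigma_2^2}\,dx + \frac{1}{2}\int_{-\infty}^{\gamma}\frac{1}{\sqrt{2\pi}\,\sigma_1}e^{-(x-\mu_1)^2/2\sigma_1^2}\,dx\\ &= \frac{1}{2}\int_{(\gamma-\mu_2)/\sigma_2}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx + \frac{1}{2}\int_{(-\gamma+\mu_1)/\sigma_1}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx.\end{aligned} \tag{178}$$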


Under conditions where the Gaussian approximation holds, this form of the error probability would be very useful in analyzing the error probability for the general case of $m$ cells or measurement positions in the image plane (see (168)).

It is desirable to compute the error probabilities of these two methods (the Poisson expression (173) and the Gaussian approximation (178)) to determine how good the Gaussian approximation is. The Gaussian approximation turns out to be good even for very small values of $z$. The approximation is good for small values of $z$ because we are working with the "tails" of two distributions and when they are added the approximation errors of the distributions compensate. This does not hold in general for $p(\omega_1) \ne 1/2$. For an indication of the validity of this approximation, see Figure 15 and Table 1. Figure 15 compares the ratio of the Poisson error probability and the Gaussian-approximation error probability for different ratios of the signals $y_1$ and $y_2$. The Poisson error probabilities and the approximate error probabilities have a significant difference only for situations where the error probabilities become small (e.g., $P_e < 10^{-3}$). For the case of very small error probabilities, we are far out on the "tails" of the distributions and our approximation breaks down. The approximation breaks down because the Central Limit Theorem does not converge linearly and hence the Gaussian approximation holds only for the center portion of the distribution.
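The comparison between the Poisson sum and its Gaussian approximation can be reproduced for a single cell with a short script. This is a sketch under the equal-prior assumption used in the text; the signal values `y1 = 20`, `y2 = 10` and `eta_tau = 1` are illustrative and are not taken from Figure 15 or Table 1.

```python
import math

def poisson_cdf(k, mean):
    """P(Z <= k) for a Poisson variable; 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return min(total, 1.0)

def q_function(x):
    """Upper tail of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def threshold(y1, y2, eta_tau):
    """Single-cell threshold gamma for two known signals."""
    return eta_tau * (y1 - y2) / math.log(y1 / y2)

def poisson_error(y1, y2, eta_tau):
    g = threshold(y1, y2, eta_tau)
    p_fa = 1.0 - poisson_cdf(math.floor(g), eta_tau * y2)   # P(z > gamma | omega_2)
    p_fd = poisson_cdf(math.ceil(g) - 1, eta_tau * y1)      # P(z < gamma | omega_1)
    return 0.5 * p_fa + 0.5 * p_fd

def gaussian_error(y1, y2, eta_tau):
    g = threshold(y1, y2, eta_tau)
    mu1, mu2 = eta_tau * y1, eta_tau * y2
    return (0.5 * q_function((g - mu2) / math.sqrt(mu2))
            + 0.5 * q_function((mu1 - g) / math.sqrt(mu1)))

pe_poisson = poisson_error(20.0, 10.0, 1.0)
pe_gauss = gaussian_error(20.0, 10.0, 1.0)
ratio = pe_poisson / pe_gauss
```

Sweeping `y1 / y2` upward drives both error probabilities down, and `ratio` can then be inspected to see where the two expressions begin to diverge.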


Two  Unknown  Signals 

Now assume that at the input of the detector we have one of two possible signals $y_1$ and $y_2$, both of which contain any background noise that may be present. These signals are assumed fixed but unknown, with each of their elements taken from statistically independent gamma distributions with known parameters:

$$f(y_j) = \prod_{i=1}^{m} f(y_{ji}) = \begin{cases}\displaystyle\prod_{i=1}^{m}\frac{\alpha_{ji}(\alpha_{ji}y_{ji})^{u_{ji}-1}e^{-\alpha_{ji}y_{ji}}}{\Gamma(u_{ji})}, & y_{ji} > 0,\\[2ex] 0, & \text{otherwise.}\end{cases} \tag{179}$$

[Figure 15. Vertical axis: $P_e(P)/P_e(G)$.]

Table 1. Poisson error probabilities and Gaussian-approximation error probabilities for the cases of discriminating between two known signals and discriminating between two unknown signals.

Our two possible states of nature are:

$$\omega_1:\ y = y_1 \ \text{(unknown)}, \qquad \omega_2:\ y = y_2 \ \text{(unknown)},$$

where $\bar{y}_1 > \bar{y}_2$. (The quantity $\bar{y}$ is the expected value of $y$.) We want to decide which of these two states of nature is present.

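The likelihood ratio in this case is

$$\ell(z) = \frac{p(z/\omega_1)}{p(z/\omega_2)} = \frac{\int p(z/\omega_1, y_1)f(y_1)\,dy_1}{\int p(z/\omega_2, y_2)f(y_2)\,dy_2}, \tag{180}$$

where

$$p(z/\omega_j, y_j) = \prod_{i=1}^{m}\frac{(\eta\tau y_{ji})^{z_i}e^{-\eta\tau y_{ji}}}{z_i!} \tag{181}$$

and

$$p(z/\omega_j) = \int_0^{\infty}p(z/\omega_j, y_j)f(y_j)\,dy_j = \prod_{i=1}^{m}\left(\frac{\alpha_{ji}}{\alpha_{ji}+\eta\tau}\right)^{u_{ji}}\frac{(\eta\tau)^{z_i}\,\Gamma(z_i + u_{ji})}{z_i!\,(\eta\tau + \alpha_{ji})^{z_i}\,\Gamma(u_{ji})} \quad (j = 1, 2). \tag{182}$$

Hence, the likelihood ratio becomes

$$\ell(z) = \prod_{i=1}^{m}\frac{\alpha_{1i}^{u_{1i}}}{\alpha_{2i}^{u_{2i}}}\,\frac{(\alpha_{2i}+\eta\tau)^{z_i+u_{2i}}}{(\alpha_{1i}+\eta\tau)^{z_i+u_{1i}}}\,\frac{\Gamma(z_i+u_{1i})\,\Gamma(u_{2i})}{\Gamma(z_i+u_{2i})\,\Gamma(u_{1i})}. \tag{183}$$

Assume for convenience that $u_{1i} = u_{2i} = u_i$ for all $i$; then

$$\ell(z) = \prod_{i=1}^{m}\left(\frac{\alpha_{1i}}{\alpha_{2i}}\right)^{u_i}\left(\frac{\alpha_{2i}+\eta\tau}{\alpha_{1i}+\eta\tau}\right)^{z_i+u_i} \tag{184}$$

or

$$L(z) = \ln\ell(z) = \sum_{i=1}^{m}\left[u_i\ln\left(\frac{\alpha_{1i}}{\alpha_{2i}}\right) + (z_i + u_i)\ln\left(\frac{\alpha_{2i}+\eta\tau}{\alpha_{1i}+\eta\tau}\right)\right]. \tag{185}$$

Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m} z_i\ln\left(\frac{\alpha_{2i}+\eta\tau}{\alpha_{1i}+\eta\tau}\right) > \sum_{i=1}^{m}u_i\left[\ln\left(\frac{\alpha_{2i}}{\alpha_{1i}}\right) + \ln\left(\frac{\alpha_{1i}+\eta\tau}{\alpha_{2i}+\eta\tau}\right)\right],\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{186}$$

For this case the detector has the form of a digital matched filter where the filter is matched to $\ln[(\alpha_{2i}+\eta\tau)/(\alpha_{1i}+\eta\tau)]$. This detector is illustrated in Figure 16.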
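The simplification that follows from a common shape parameter can be checked numerically: the closed-form log-likelihood ratio should equal the difference of the two log-marginals. The sketch below assumes a two-cell example with illustrative parameter values; the function names are not from the source.

```python
import math

def log_marginal(zi, ui, alpha, eta_tau):
    """ln p(z_i / omega_j) for a Poisson count whose mean rate has a gamma prior."""
    return (ui * math.log(alpha / (alpha + eta_tau))
            + zi * math.log(eta_tau / (alpha + eta_tau))
            + math.lgamma(zi + ui) - math.lgamma(zi + 1) - math.lgamma(ui))

def log_lr_common_shape(z, u, alpha1, alpha2, eta_tau):
    """L(z) when both hypotheses share the same shape parameters u_i."""
    total = 0.0
    for zi, ui, a1, a2 in zip(z, u, alpha1, alpha2):
        total += ui * math.log(a1 / a2)
        total += (zi + ui) * math.log((a2 + eta_tau) / (a1 + eta_tau))
    return total

# Assumed example: prior means u/alpha are 10 under omega_1 and 5 under omega_2.
z = [9, 4]
u = [5.0, 5.0]
alpha1 = [0.5, 0.5]
alpha2 = [1.0, 1.0]
eta_tau = 1.0

L = log_lr_common_shape(z, u, alpha1, alpha2, eta_tau)
direct = sum(log_marginal(zi, ui, a1, eta_tau) - log_marginal(zi, ui, a2, eta_tau)
             for zi, ui, a1, a2 in zip(z, u, alpha1, alpha2))
choice = 1 if L > 0 else 2
```

Because the gamma functions cancel when the shapes match, the statistic is again linear in the counts, which is what gives the detector its matched-filter structure.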

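For an error analysis, we again consider the single cell case. Bayes' decision rule for this case becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > u\left[\frac{\ln(\alpha_2/\alpha_1)}{\ln\left(\dfrac{\alpha_2+\eta\tau}{\alpha_1+\eta\tau}\right)} - 1\right] = \gamma,\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{187}$$

The error probability is

$$P_e = \frac{1}{2}\sum_{z > \gamma}\frac{(\eta\tau u/\alpha_2)^z e^{-\eta\tau u/\alpha_2}}{z!} + \frac{1}{2}\sum_{z=0}^{z<\gamma}\frac{(\eta\tau u/\alpha_1)^z e^{-\eta\tau u/\alpha_1}}{z!}. \tag{188}$$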


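Again assume a Gaussian approximation for $z$. The conditional means and variances necessary to specify the Gaussian densities $f(z/\omega_2)$ and $f(z/\omega_1)$ are as follows:

$$E(z/\omega_i) = E[E(z/\omega_i, y_i)] = E(\eta\tau y_i) = \eta\tau u/\alpha_i, \tag{189}$$

$$E(z/\omega_1) = \mu_1 = \eta\tau u/\alpha_1, \tag{190}$$

$$E(z/\omega_2) = \mu_2 = \eta\tau u/\alpha_2, \tag{191}$$

$$\mathrm{Var}(z/\omega_i) = E[\mathrm{Var}(z/\omega_i, y_i)] + \mathrm{Var}[E(z/\omega_i, y_i)], \tag{192}$$

$$\mathrm{Var}(z/\omega_1) = \sigma_1^2 = \eta\tau u/\alpha_1 + (\eta\tau/\alpha_1)^2 u, \tag{193}$$

$$\mathrm{Var}(z/\omega_2) = \sigma_2^2 = \eta\tau u/\alpha_2 + (\eta\tau/\alpha_2)^2 u. \tag{194}$$

Hence, using (189)-(194) and a Gaussian probability density function approximation we have

$$P_e \approx \frac{1}{2}\int_{(\gamma-\mu_2)/\sigma_2}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx + \frac{1}{2}\int_{(-\gamma+\mu_1)/\sigma_1}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx. \tag{195}$$

[Figure 16 caption fragment: multiple cell threshold detector designed for the two unknown ...]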
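The conditional mean and variance can be verified by summing the marginal probability of $z$ directly. The sketch below assumes illustrative values `u = 4`, `alpha = 0.5`, `eta_tau = 1`, and truncates the sum at an arbitrary `z_max`, beyond which the remaining probability is negligible for these values.

```python
import math

def log_marginal_pmf(z, u, alpha, eta_tau):
    """ln p(z) for a Poisson count whose mean rate is gamma distributed."""
    return (u * math.log(alpha / (alpha + eta_tau))
            + z * math.log(eta_tau / (alpha + eta_tau))
            + math.lgamma(z + u) - math.lgamma(z + 1) - math.lgamma(u))

def numeric_moments(u, alpha, eta_tau, z_max=400):
    """Mean and variance of z obtained by direct summation of the marginal."""
    pmf = [math.exp(log_marginal_pmf(z, u, alpha, eta_tau)) for z in range(z_max)]
    mean = sum(z * p for z, p in enumerate(pmf))
    second = sum(z * z * p for z, p in enumerate(pmf))
    return mean, second - mean * mean

u, alpha, eta_tau = 4.0, 0.5, 1.0
mean_num, var_num = numeric_moments(u, alpha, eta_tau)

# Law of total expectation and total variance.
mean_formula = eta_tau * u / alpha
var_formula = eta_tau * u / alpha + (eta_tau / alpha) ** 2 * u
```

The second variance term is the contribution of the uncertainty in the mean rate; it is the reason the unknown-signal case is more dispersed than a pure Poisson count with the same mean.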
We need to compute the error probabilities for these two cases (Poisson and Gaussian-approximation) and compare them to determine how good the Gaussian approximation is for the two-unknown-signals case.

See Figures 17 and 18 and Table 1 for a comparison of the Poisson error probabilities with the Gaussian-approximation error probabilities. Figure 17 shows the Poisson error probabilities and the Gaussian-approximation error probabilities for the cases where both signals are known and both signals are unknown. The Poisson error probabilities, for known and unknown signals, coincide with the Gaussian-approximation error probabilities for known signals. The Gaussian approximation for unknown signals is reasonably accurate only for cases where the variances of the unknown signals are smaller than about 0.1. Figure 18 compares the ratio of the Poisson error probability and the Gaussian-approximation error probability for different ratios of the expected values of the signals (i.e., $\bar{y}_1$ and $\bar{y}_2$) and different values of $u$ (note that $\mathrm{var}\,y_i = \bar{y}_i^2/u$ $(i = 1, 2)$).

One Unknown Signal and One Known Signal

Consider the case of having present at the input of the detector either the known signal $y_2$ or the unknown signal $y_1$. For this case we have the two possible states of nature:

$$\omega_1:\ y = y_1 \ \text{(unknown)}, \qquad \omega_2:\ y = y_2 \ \text{(known)},$$

where $\bar{y}_1 > y_2$. We consider this case because of its advantages in analyzing the detection error, from which we can gain additional insight into the detection process. This case corresponds to physical cases of large signal-to-noise ratio in which we neglect the noise and discriminate between two signals. The unknown signal is assumed fixed but unknown, with each of its elements taken from statistically independent gamma distributions with known parameters:

$$f(y_1) = \prod_{i=1}^{m}f(y_{1i}) = \begin{cases}\displaystyle\prod_{i=1}^{m}\frac{\alpha_i(\alpha_i y_{1i})^{u_i-1}e^{-\alpha_i y_{1i}}}{\Gamma(u_i)}, & y_{1i} > 0,\\[2ex] 0, & \text{otherwise.}\end{cases} \tag{196}$$

Bayes' decision rule is:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \ell(z) = \frac{p(z/\omega_1)}{p(z/\omega_2)} > 1,\\ &\text{choose } \omega_2 \text{ otherwise,}\end{aligned}$$

where

$$p(z/\omega_2) = \prod_{i=1}^{m}\frac{(\eta\tau y_{2i})^{z_i}e^{-\eta\tau y_{2i}}}{z_i!} \tag{198}$$
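and

$$p(z/\omega_1) = \int p(z/\omega_1, y_1)f(y_1)\,dy_1 = \prod_{i=1}^{m}\left(\frac{\alpha_i}{\alpha_i+\eta\tau}\right)^{u_i}\frac{(\eta\tau)^{z_i}\,\Gamma(z_i+u_i)}{z_i!\,(\eta\tau+\alpha_i)^{z_i}\,\Gamma(u_i)}. \tag{199}$$

The likelihood ratio becomes

$$\ell(z) = \prod_{i=1}^{m}\left(\frac{\alpha_i}{\alpha_i+\eta\tau}\right)^{u_i}\frac{\Gamma(z_i+u_i)\,e^{\eta\tau y_{2i}}}{[y_{2i}(\eta\tau+\alpha_i)]^{z_i}\,\Gamma(u_i)}. \tag{200}$$

Taking the logarithm of the likelihood ratio yields

$$L(z) = \ln\ell(z) = \sum_{i=1}^{m}\left[u_i\ln\left(\frac{\alpha_i}{\alpha_i+\eta\tau}\right) - \ln\Gamma(u_i) + \ln\Gamma(z_i+u_i) - z_i\ln[y_{2i}(\eta\tau+\alpha_i)] + \eta\tau y_{2i}\right]. \tag{201}$$

Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m}\left\{\ln\Gamma(z_i+u_i) - z_i\ln[y_{2i}(\eta\tau+\alpha_i)]\right\} > \sum_{i=1}^{m}\left[\ln\Gamma(u_i) + u_i\ln\left(\frac{\alpha_i+\eta\tau}{\alpha_i}\right) - \eta\tau y_{2i}\right],\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{202}$$

Using the approximation that

$$\ln\Gamma(x+1) \approx \tfrac{1}{2}\ln 2\pi - x + (x + \tfrac{1}{2})\ln x \tag{203}$$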
we can write for Bayes' decision rule:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m}\left[(z_i+u_i-\tfrac{1}{2})\ln(z_i+u_i-1) - z_i\ln[y_{2i}(\eta\tau+\alpha_i)] - z_i\right]\\ &\qquad > \sum_{i=1}^{m}\left[(u_i-\tfrac{1}{2})\ln(u_i-1) + u_i\ln\left(\frac{\alpha_i+\eta\tau}{\alpha_i}\right) - \eta\tau y_{2i}\right],\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{204}$$

This detector is illustrated in Figure 19. For the single cell case, Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } (z+u-\tfrac{1}{2})\ln(z+u-1) - z\ln[y_2(\eta\tau+\alpha)] - z > (u-\tfrac{1}{2})\ln(u-1) + u\ln\left(\frac{\alpha+\eta\tau}{\alpha}\right) - \eta\tau y_2,\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{205}$$

Finding the error probability of this decision rule analytically, as was done in the previous cases, appears very difficult. For this case, computer simulation (Monte Carlo method) was used to analyze the error probability. The results are discussed later.
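A Monte Carlo estimate of this kind can be sketched as follows. This is not the simulation used in the source: it assumes that under $\omega_1$ the mean rate is drawn afresh from its gamma prior on every trial, it uses the exact log-gamma function rather than the approximation (203), and the sampler, seed, and parameter values are illustrative choices.

```python
import math
import random

def log_lr(z, u, alpha, y2, eta_tau):
    """Single-cell log-likelihood ratio: gamma-mixed Poisson against a known Poisson rate."""
    return (u * math.log(alpha / (alpha + eta_tau)) - math.lgamma(u)
            + math.lgamma(z + u) - z * math.log(y2 * (eta_tau + alpha))
            + eta_tau * y2)

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for moderate means."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_error(u, alpha, y2, eta_tau, trials, seed=0):
    """Fraction of wrong decisions with equal prior probabilities."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        if rng.random() < 0.5:
            # omega_1: unknown rate drawn from the gamma prior (shape u, rate alpha).
            y1 = rng.gammavariate(u, 1.0 / alpha)
            z = sample_poisson(eta_tau * y1, rng)
            errors += log_lr(z, u, alpha, y2, eta_tau) <= 0
        else:
            # omega_2: known rate y2.
            z = sample_poisson(eta_tau * y2, rng)
            errors += log_lr(z, u, alpha, y2, eta_tau) > 0
    return errors / trials

pe = simulate_error(u=10.0, alpha=0.5, y2=10.0, eta_tau=1.0, trials=1000, seed=1)
```

With one thousand trials the estimate carries a sampling error of a few percent, so small error probabilities would need many more trials to resolve.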

Unknown  Signal  Imbedded  in  Known  Noise 

Consider the case where we have one unknown signal $s$ which at the input of the detector is imbedded in known noise $n$. The two possible states of nature are:

$$\omega_1:\ y = s + n \ \text{(unknown signal plus noise)}, \qquad \omega_2:\ y = n \ \text{(noise alone)}.$$

The signal $s$ is assumed fixed but unknown, with each of its elements taken from statistically independent gamma distributions with known parameters:

$$f(s) = \prod_{i=1}^{m}f(s_i) = \begin{cases}\displaystyle\prod_{i=1}^{m}\frac{\beta_i(\beta_i s_i)^{u_i-1}e^{-\beta_i s_i}}{\Gamma(u_i)}, & s_i > 0,\\[2ex] 0, & \text{otherwise.}\end{cases} \tag{206}$$

For this case

$$p(z/\omega_2) = \prod_{i=1}^{m}\frac{(\eta\tau n_i)^{z_i}e^{-\eta\tau n_i}}{z_i!} \tag{207}$$

and

$$p(z/\omega_1, s) = \prod_{i=1}^{m}\frac{[\eta\tau(s_i+n_i)]^{z_i}e^{-\eta\tau(s_i+n_i)}}{z_i!}. \tag{208}$$

Hence,

$$\ell(z/s) = \frac{p(z/\omega_1, s)}{p(z/\omega_2)} = \prod_{i=1}^{m}\left(\frac{s_i+n_i}{n_i}\right)^{z_i}e^{-\eta\tau s_i}. \tag{209}$$
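Our likelihood ratio then becomes

$$\ell(z) = \int\ell(z/s)f(s)\,ds = \prod_{i=1}^{m}\left(\frac{\beta_i}{\beta_i+\eta\tau}\right)^{u_i}\frac{z_i!}{[n_i(\beta_i+\eta\tau)]^{z_i}\,\Gamma(u_i)}\sum_{j=0}^{z_i}\frac{[n_i(\beta_i+\eta\tau)]^{j}\,\Gamma(z_i+u_i-j)}{(z_i-j)!\,j!}. \tag{210}$$

Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m}\left\{\ln\left[z_i!\sum_{j=0}^{z_i}\frac{[n_i(\beta_i+\eta\tau)]^{j}\,\Gamma(z_i+u_i-j)}{(z_i-j)!\,j!}\right] - z_i\ln[n_i(\beta_i+\eta\tau)]\right\}\\ &\qquad > \sum_{i=1}^{m}\left[u_i\ln\left(\frac{\beta_i+\eta\tau}{\beta_i}\right) + \ln\Gamma(u_i)\right],\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{211}$$

Using the approximation of (203) Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \sum_{i=1}^{m}\left\{\ln\left[\sum_{j=0}^{z_i}\frac{[n_i(\beta_i+\eta\tau)]^{j}\,\Gamma(z_i+u_i-j)}{(z_i-j)!\,j!}\right] + z_i\left[\ln z_i - 1 - \ln[n_i(\beta_i+\eta\tau)]\right] + \tfrac{1}{2}\ln z_i\right\}\\ &\qquad > \sum_{i=1}^{m}\left[(u_i-\tfrac{1}{2})\ln(u_i-1) - u_i + 1 + u_i\ln\left(\frac{\beta_i+\eta\tau}{\beta_i}\right)\right],\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{212}$$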
This detector is illustrated in Figure 20.

For the single cell case Bayes' decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } \ln\left[\sum_{j=0}^{z}\frac{[n(\beta+\eta\tau)]^{j}\,\Gamma(z+u-j)}{(z-j)!\,j!}\right] + z\left\{\ln z - 1 - \ln[n(\beta+\eta\tau)]\right\} + \tfrac{1}{2}\ln z\\ &\qquad > (u-\tfrac{1}{2})\ln(u-1) - u + 1 + u\ln\left(\frac{\beta+\eta\tau}{\beta}\right),\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{213}$$

Finding the error probability of this decision rule analytically appears extremely difficult, if not impossible. For this case it is hard to use appropriate approximations without approximating the problem away. About the only course left open is to simulate the problem.
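A simulation needs the likelihood ratio itself, and the finite sum it contains can overflow for large counts. The sketch below evaluates the single-cell log-likelihood ratio in a numerically stable way; it assumes the binomial-expansion form of the integral of $[(s+n)/n]^z e^{-\eta\tau s} f(s)$ over the gamma prior, and the parameter values are illustrative.

```python
import math

def log_lr_signal_in_noise(z, u, beta, n, eta_tau):
    """ln l(z) for one cell: gamma-distributed signal rate in known noise rate n.

    Expands (1 + s/n)^z binomially, integrates each term against the gamma
    prior, and combines the terms with a log-sum-exp to avoid overflow.
    """
    c = n * (beta + eta_tau)
    terms = [math.lgamma(z + 1) - math.lgamma(z - j + 1) - math.lgamma(j + 1)
             + math.lgamma(u + j) - j * math.log(c)
             for j in range(z + 1)]
    top = max(terms)
    log_sum = top + math.log(sum(math.exp(t - top) for t in terms))
    return u * math.log(beta / (beta + eta_tau)) + log_sum - math.lgamma(u)

# Assumed example values.
L6 = log_lr_signal_in_noise(6, 3.0, 0.5, 2.0, 1.0)
L0 = log_lr_signal_in_noise(0, 3.0, 0.5, 2.0, 1.0)
choice = 1 if L6 > 0 else 2
```

Working with logarithms of the individual terms is one way around the overflow problems that arise when the gamma functions are evaluated directly.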

Estimator-Correlator 

We have been considering the optimum detection (Bayes' decision rule) of Poisson signals with unknown parameters (mean rates) where the a priori probability distributions of the unknown parameters are available at the receiver. It follows from the optimality of the solution that it is not possible to improve the detection performance by estimating the unknown signals first and then using these estimates in a detector as if they were the true input signals. Kailath (1963) has shown that for the special case of Gaussian signals with unknown parameters the optimum detector can be interpreted as a minimum mean-square-error estimate of the signal followed by a detector that treats the estimate as the true value of the input signal. We will refer to this type of detector as an estimator-correlator. The results of the Gaussian signal case do not apply in general to other signal distributions.

The  question  arises  as  to  the  difference  between  the  Bayes'  decision 
rule  and  the  decision  rule  associated  with  an  estimator-correlator  when 
the  unknown  signals  have  conditional  Poisson  distributions  with  unknown 
mean  rates  which  have  known  gamma  distributions. 

Consider the case of receiving at the input of the detector one of two possible signals with conditional Poisson distributions and fixed but unknown mean rates $y_1$ and $y_2$. Each of these rates is taken from a gamma distribution with known parameters. For this case Bayes' decision rule (scalar case) was found to be:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > u\left[\frac{\ln(\alpha_2/\alpha_1)}{\ln\left(\dfrac{\alpha_2+\eta\tau}{\alpha_1+\eta\tau}\right)} - 1\right] = \gamma,\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{214}$$

For the known signal case, Bayes' decision rule is:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > \frac{\eta\tau(y_1 - y_2)}{\ln(y_1/y_2)},\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{215}$$

Hence, for the estimator-correlator with which we make a comparison we have:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > \frac{\eta\tau(\hat{y}_1 - \hat{y}_2)}{\ln(\hat{y}_1/\hat{y}_2)},\\ &\text{choose } \omega_2 \text{ otherwise,}\end{aligned} \tag{216}$$

where $\hat{y}_1$ and $\hat{y}_2$ are estimates of the unknown mean rates, and here we consider these estimates to be the true mean rates.

Using the assumption that $y_1$ and $y_2$ are taken from known gamma distributions our estimates of them are as follows:

1) Maximum a posteriori estimate

$$\hat{y}_i = \frac{z + u_i - 1}{\alpha_i + \eta\tau} \quad (i = 1, 2), \tag{217}$$

2) Bayes' estimate

$$\hat{y}_i = \frac{z + u_i}{\alpha_i + \eta\tau} \quad (i = 1, 2), \tag{218}$$

3) Minimum mean-square error estimate

$$\hat{y}_i = \frac{z + u_i}{\alpha_i + \eta\tau} \quad (i = 1, 2), \tag{219}$$

4) Maximum likelihood estimate

$$\hat{y}_i = \frac{z}{\eta\tau} \quad (i = 1, 2). \tag{220}$$

It is observed that the Bayes' estimate and the minimum MSE estimate are equal. The maximum a posteriori estimate differs from these two estimates by a minus one in the numerator, and for $u \gg 1$ all three estimates become approximately equal.

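Consider the maximum a posteriori estimate in the estimator-correlator. The decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > \frac{(u-1)\left(\dfrac{1}{\alpha_1+\eta\tau} - \dfrac{1}{\alpha_2+\eta\tau}\right)}{\dfrac{1}{\eta\tau}\ln\left(\dfrac{\alpha_2+\eta\tau}{\alpha_1+\eta\tau}\right) - \left(\dfrac{1}{\alpha_1+\eta\tau} - \dfrac{1}{\alpha_2+\eta\tau}\right)},\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{221}$$

When the Bayes' estimate and the minimum MSE estimate are used in the estimator-correlator the decision rule becomes:

$$\begin{aligned}&\text{choose } \omega_1 \text{ if } z > \frac{u\left(\dfrac{1}{\alpha_1+\eta\tau} - \dfrac{1}{\alpha_2+\eta\tau}\right)}{\dfrac{1}{\eta\tau}\ln\left(\dfrac{\alpha_2+\eta\tau}{\alpha_1+\eta\tau}\right) - \left(\dfrac{1}{\alpha_1+\eta\tau} - \dfrac{1}{\alpha_2+\eta\tau}\right)},\\ &\text{choose } \omega_2 \text{ otherwise.}\end{aligned} \tag{222}$$

The thresholds for the above two decision rules differ only by the terms $u$ and $u-1$. For $u \gg 1$ the two thresholds become approximately equal, and hence the decision rules become approximately the same. The maximum likelihood estimate does not work in this case because there is no a priori information contained in it (both estimates equal $z/\eta\tau$) and hence there is no way to distinguish between $y_1$ and $y_2$. A comparison of the error probabilities resulting from the decision rules of the estimator-correlators with the error probabilities resulting from Bayes' decision rule will follow.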
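The three single-cell thresholds can be compared numerically. The sketch below assumes a common shape parameter `u` for both signals; the function names and the values `u = 10`, `alpha1 = 0.5`, `alpha2 = 1`, `eta_tau = 1` are illustrative.

```python
import math

def bayes_threshold(u, a1, a2, eta_tau):
    """Threshold of Bayes' rule for two gamma-distributed unknown rates."""
    return u * (math.log(a2 / a1) / math.log((a2 + eta_tau) / (a1 + eta_tau)) - 1.0)

def estimator_correlator_threshold(u_eff, a1, a2, eta_tau):
    """Threshold when estimates (z + u_eff) / (alpha_i + eta_tau) are plugged in.

    u_eff = u gives the Bayes'/minimum-MSE version; u_eff = u - 1 gives the
    maximum a posteriori version.
    """
    d = 1.0 / (a1 + eta_tau) - 1.0 / (a2 + eta_tau)
    return u_eff * d / (math.log((a2 + eta_tau) / (a1 + eta_tau)) / eta_tau - d)

u, a1, a2, eta_tau = 10.0, 0.5, 1.0, 1.0
t_bayes = bayes_threshold(u, a1, a2, eta_tau)
t_mmse = estimator_correlator_threshold(u, a1, a2, eta_tau)
t_map = estimator_correlator_threshold(u - 1.0, a1, a2, eta_tau)
```

The two estimator-correlator thresholds differ by the factor `u / (u - 1)`, so they approach each other as `u` grows, in line with the remark above.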
Comparison of Error Probabilities

In  this  section  comparisons  will  be  made  of  calculated  and  simulated 
error  probabilities  resulting  from  the  various  fixed-sample  detectors 
that  we  have  considered. 

Figures  21,  22,  and  23  show  the  calculated  error  probabilities  for 

the  two-known-signals  case,  the  two- unknown- signals  case,  and  both 

cases  of  the  estimator-correlator  for  various  ratios  of  the  signals  y^  or  I 

y_  and  three  values  of  u.  As  u  increases  (var  y.  decreases)  the  error 
^  1 

probabilities  of  the  two -unknown- signals  case  and  the  two  cases  of 
estimator-correlators  approach  the  error  probabilities  of  the  two-krown- 
signals  case,  and  in  fact  they  all  coincide  for  u  =  100.  The  Fayes 1  and 
minimum  MSE  estimator-correlator  appears  to  be  superior  to  the  ronx’mtin 
a  posteriori  estimator-correlator  for  small  values  of  u.  Figures  24 
and  25  correspond  to  Figure  22  where  the  measuring  interval  ( rjT  )  has 
been  increased  to  2  and  3  respectively.  Figures  26,  27  and  28  are  tho 
computer- simulated  error  probabilities  corresponding  to  Figures  22,  24, 
and  25.  The  simulated  results  compare  very  favorably  v/ith  the 
calculated  results.  For  both  the  calculated  and  simulated  results,  t!'o 
Bayes'  and  minimum  MSE  estimator-correlator  is  consistently  be‘t~- 
~han  the  maximum  a  posteriori  estimator-correlator.  Figure  29  r!:o-vs 
:he  error  probabilities  for  the  case  of  one  unknown  sign.'1,  in  known 
noise  and  the  case  of  one  unknown  signal  and  one  known  s'r'rsal.  Th-  'r 
.  rr or  probabilities  are  compared  with  the  simulated  error  prr.bu 


Figure 24. Calculated error probabilities versus ratio of $y_1$ and $y_2$ for various detectors, $u = 10$, $y_2 = 10$, and $\eta\tau = 2$.

Figure 26. Computer-simulated error probabilities versus ratio of $y_1$ and $y_2$ for various detectors, $u = 10$, $y_2 = 10$, and $\eta\tau = \ldots$

Legend: • two known signals; ○ two unknown signals; △ Bayes' and minimum MSE estimator-correlator; × maximum a posteriori estimator-correlator.

Figure 27. Computer-simulated error probabilities versus ratio of $y_1$ and $y_2$ for various detectors, $u = 10$, $y_2 = 10$, and $\eta\tau = \ldots$
of the two-known-signals case. One thousand trials were made for each of the simulated results.

In general, if the error probabilities do not become less than about 0.1, it can be seen that knowing the signal parameters (mean rates) is not much better than knowing only their probability distributions, particularly if the Bayes' decision rule or the Bayes estimator-correlator is used. In many problems of signal detection, error probabilities much smaller than 0.1 are important. Errors of this size were more difficult to calculate on the computer because of overflow problems, and hence fewer results were obtained for these small errors. For error probabilities of this size, the differences among the error probabilities of the various fixed-sample detectors become significant (see Table 1).


SEQUENTIAL  DETECTION 




Introduction 


In many problems of optical data processing, detection speed is important. For these situations a sequential detection procedure should be considered as an alternative to the conventional fixed-sample detection procedure. The sequential detection test, which was developed by Wald (1947), minimizes the average test length for given P(FD) and P(FA) and hence has a shorter average test length than the fixed-sample test. In the sequential test we introduce two thresholds at the output of the detector such that we declare "signal present" if one of the thresholds is exceeded and "signal absent" if we fall below the other threshold. The number of observations, or test length, is not fixed in advance. It depends on the outcome of the experiment and is therefore not predetermined but a random variable.

Let $p(z \mid \omega_1)$ be the probability distribution of the observed random variable $z$ when $\omega_1$ is the true state of nature and $p(z \mid \omega_2)$ be the probability distribution when $\omega_2$ is the true state of nature. If we make $j$ successive observations of $z$, the probability density for the sample $(z^1, z^2, \ldots, z^j)$ is given by $p_j(z \mid \omega_1) = p(z^1, \ldots, z^j \mid \omega_1)$ when $\omega_1$ is the true state of nature and $p_j(z \mid \omega_2) = p(z^1, \ldots, z^j \mid \omega_2)$ when $\omega_2$ is the true state of nature. The quantity $z^i$ is a vector whose elements are the outputs of the cells of the image plane during the $i$th observation, $z^i = (z_1^i, z_2^i, \ldots, z_m^i)$.
We need to determine two positive constants $A$ and $B$, both depending on the error probabilities P(FD) $= \beta$ and P(FA) $= \alpha$, which are specified beforehand. P(FD) is the probability of false dismissal and P(FA) is the probability of false alarm. The constants $A$ and $B$ satisfy the approximate relationships (Wald, 1947):

$$A \approx \frac{1-\beta}{\alpha} \tag{223}$$

and

$$B \approx \frac{\beta}{1-\alpha}. \tag{224}$$
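As a small illustration of the threshold relationships (223) and (224), the following sketch computes $A$ and $B$ and their logarithms from preassigned error probabilities. The function name `wald_thresholds` and the numerical values are hypothetical choices made for this example only.

```python
import math


def wald_thresholds(alpha, beta):
    """Approximate SPRT thresholds A and B from (223) and (224).

    alpha is P(FA), the false-alarm probability; beta is P(FD), the
    false-dismissal probability. Both names are illustrative.
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0 and alpha + beta < 1.0):
        raise ValueError("need 0 < alpha, beta and alpha + beta < 1")
    upper = (1.0 - beta) / alpha      # A, equation (223)
    lower = beta / (1.0 - alpha)      # B, equation (224)
    return upper, lower


A, B = wald_thresholds(0.01, 0.01)
print(A, B, math.log(A), math.log(B))
```

When $\alpha = \beta$ the two logarithmic thresholds are symmetric about zero, $\ln B = -\ln A$.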


Bussgang and Middleton (1955) consider the sequential probability ratio test (SPRT) for testing $\omega_1$ against $\omega_2$. The definition of the SPRT for the $m$ cell case is as follows: at each stage of the experiment, compute the probability ratio

$$\ell(j) = \frac{p_j(z \mid \omega_1)}{p_j(z \mid \omega_2)} = \frac{p(z^1, z^2, \ldots, z^j \mid \omega_1)}{p(z^1, z^2, \ldots, z^j \mid \omega_2)}; \tag{225}$$

choose $\omega_1$ if $\ell(j) \ge A$,
choose $\omega_2$ if $\ell(j) \le B$,
continue the experiment by making another observation if $B < \ell(j) < A$.
We are also assuming that the numbers of electrons emitted from the different regions or cells of the image plane are independent. Hence, given $\omega_q$, the probability of $z_1$ electrons being emitted from region 1, ..., $z_m$ electrons from region $m$, all during the $j$ observation intervals, is

$$p_j(z \mid \omega_q) = \prod_{\ell=1}^{m} p_j(z_\ell \mid \omega_q) \qquad (q = 1, 2) \tag{226}$$

where $m$ is the number of cells in the image plane. The likelihood or probability ratio can then be written as

$$\ell(j) = \prod_{\ell=1}^{m} \frac{p_j(z_\ell \mid \omega_1)}{p_j(z_\ell \mid \omega_2)}. \tag{227}$$


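If the $j$ successive observations are statistically independent, the probability ratio becomes

$$\ell(j) = \frac{p(z^1, \ldots, z^j \mid \omega_1)}{p(z^1, \ldots, z^j \mid \omega_2)} = \prod_{i=1}^{j} \prod_{\ell=1}^{m} \frac{p(z_\ell^i \mid \omega_1)}{p(z_\ell^i \mid \omega_2)}. \tag{228}$$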


The natural logarithm of this ratio may be taken for computational convenience. The probability ratio can then be expressed as

$$R_j = \ln \ell(j) = \sum_{i=1}^{j} \sum_{\ell=1}^{m} \ln \frac{p(z_\ell^i \mid \omega_1)}{p(z_\ell^i \mid \omega_2)} = \sum_{i=1}^{j} r_i \tag{229}$$

where

$$r_i = \sum_{\ell=1}^{m} \ln \frac{p(z_\ell^i \mid \omega_1)}{p(z_\ell^i \mid \omega_2)}. \tag{230}$$

The test procedure then becomes:

choose $\omega_1$ if $R_j \ge \ln A$,
choose $\omega_2$ if $R_j \le \ln B$,
make another observation if $\ln B < R_j < \ln A$.
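The logarithmic test procedure can be sketched as a short loop. This is only an illustration under stated assumptions: the function name `sprt`, the string labels it returns, and the idea of passing the increments $r_i$ as an iterable are choices made for this example, and the thresholds are taken from the approximations (223) and (224).

```python
import math


def sprt(increments, alpha, beta):
    """Run the log-domain SPRT on a stream of increments r_i (see (229)).

    Returns (decision, samples_used, final_R). The decision is "omega_1",
    "omega_2", or "undecided" if the stream ends between the thresholds.
    """
    ln_a = math.log((1.0 - beta) / alpha)   # ln A, from (223)
    ln_b = math.log(beta / (1.0 - alpha))   # ln B, from (224)
    r_total = 0.0
    used = 0
    for used, r_i in enumerate(increments, start=1):
        r_total += r_i                      # R_j = R_{j-1} + r_j
        if r_total >= ln_a:
            return "omega_1", used, r_total
        if r_total <= ln_b:
            return "omega_2", used, r_total
    return "undecided", used, r_total


print(sprt([1.0] * 10, alpha=0.01, beta=0.01))
```

With $\alpha = \beta = 0.01$ the upper threshold is $\ln 99 \approx 4.6$, so a stream of unit increments stops at the fifth sample.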

In our evaluation of this test procedure we are primarily interested in the average number of samples required to terminate the test and in the Operating Characteristic Function (OCF). The OCF, $L(y)$, is defined as the probability of choosing $\omega_2$ when the actual signal is $y$. Bussgang and Middleton (1955) show that


(232) 
The quantity $L(y)$ is needed in the calculation of the Average Sample Number (ASN). The average value of $R_k$ (see (229)) is equal to the values of the bounds (thresholds $\ln A$ and $\ln B$) weighted by the probabilities that they will be reached. That is,

$$E(R_k) = L(y) \ln B + [1 - L(y)] \ln A. \tag{233}$$

Also,

$$E(R_k) = E\left[\sum_{\ell=1}^{m} \ln \frac{p_k(z_\ell \mid \omega_1)}{p_k(z_\ell \mid \omega_2)}\right]. \tag{234}$$

For statistically independent observations we have

$$E(R_k) = E\left(\sum_{i=1}^{k} r_i\right) = k\,E(r) \tag{235}$$

where $r_i$ is given in (230).

The ASN for statistically independent observations becomes:

$$k = \frac{\beta \ln B + (1-\beta) \ln A}{E(r)} \tag{236}$$

for $\omega_1$ as the true state of nature,

$$k = \frac{\alpha \ln A + (1-\alpha) \ln B}{E(r)} \tag{237}$$

for $\omega_2$ as the true state of nature, and

$$k = \frac{L(y) \ln B + [1 - L(y)] \ln A}{E(r)} \tag{238}$$

for  general  signal  values  y.  When  the  observations  are  not  statistically 
independent,  the  ASN  is  found  by  solving  for  k  from  the  following: 


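$$\beta \ln B + (1-\beta) \ln A = E\left[\sum_{\ell=1}^{m} \ln \frac{p_k(z_\ell \mid \omega_1)}{p_k(z_\ell \mid \omega_2)}\right] \tag{239}$$

for $\omega_1$ as the true state of nature,

$$\alpha \ln A + (1-\alpha) \ln B = E\left[\sum_{\ell=1}^{m} \ln \frac{p_k(z_\ell \mid \omega_1)}{p_k(z_\ell \mid \omega_2)}\right] \tag{240}$$

for $\omega_2$ as the true state of nature, and

$$L(y) \ln B + [1 - L(y)] \ln A = E\left[\sum_{\ell=1}^{m} \ln \frac{p_k(z_\ell \mid \omega_1)}{p_k(z_\ell \mid \omega_2)}\right] \tag{241}$$

for general signal values $y$.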

Two  Known  Signals 

We will first consider two known signals $y_1$ and $y_2$ as the possible input signals to the detector, where $y_1$ and $y_2$ are the average numbers of photons that are incident upon the image plane or detector. The signals $y_1$ and $y_2$ contain any constant background noise that may be present. For this case we have the two states of nature:

$\omega_1$: $y = y_1$ (known),
$\omega_2$: $y = y_2$ (known),

where $y_1 > y_2$. The objective of our sequential test procedure is to decide which of the two states of nature, $\omega_1$ or $\omega_2$, is the true state of nature.

The probability of $z_\ell^i$ photoelectrons being emitted from the $\ell$th cell of the image plane in time $\tau$ during the $i$th observation, given that $\omega_q$ is the true state of nature, is

$$p(z_\ell^i \mid \omega_q) = \frac{(\eta\tau\, y_{q\ell})^{z_\ell^i}\, e^{-\eta\tau\, y_{q\ell}}}{z_\ell^i!} \qquad (q = 1, 2) \tag{242}$$

where $y_{q\ell}$ is the $\ell$th element of signal $y_q$ and $z_\ell^i$ is the actual number of photoelectrons emitted from the $\ell$th cell during the $i$th observation.

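Using (228) and (242), the likelihood ratio can be written as

$$\ell(j) = \prod_{\ell=1}^{m} \left(\frac{y_{1\ell}}{y_{2\ell}}\right)^{\sum_{i=1}^{j} z_\ell^i} \exp\left[-j\eta\tau\,(y_{1\ell} - y_{2\ell})\right]. \tag{243}$$

The logarithm of $\ell(j)$ is

$$R_j = \sum_{\ell=1}^{m} \left[\sum_{i=1}^{j} z_\ell^i \ln(y_{1\ell}/y_{2\ell}) - j\eta\tau\,(y_{1\ell} - y_{2\ell})\right]. \tag{244}$$

Equation (244) can also be written as

$$R_j = R_{j-1} + \sum_{\ell=1}^{m} \left[z_\ell^j \ln(y_{1\ell}/y_{2\ell}) - \eta\tau\,(y_{1\ell} - y_{2\ell})\right] \tag{245}$$

where $R_0 = 0$.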
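The recursive form of the log likelihood ratio for two known signals lends itself to a direct simulation. The sketch below is illustrative only: it assumes Poisson counts with mean $\eta\tau\,y_\ell$ in each cell, draws them with Knuth's multiplication method (adequate for small means), and uses hypothetical names (`poisson_sample`, `sprt_known_signals`) and parameter values that do not come from the original computer runs.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sprt_known_signals(y1, y2, y_true, eta_tau, alpha, beta, rng,
                       max_samples=10000):
    """Simulate one SPRT run for two known signals.

    y1, y2, y_true are lists of per-cell mean rates. Returns
    (decision, samples_used) with decision 1, 2, or 0 if undecided.
    """
    ln_a = math.log((1.0 - beta) / alpha)
    ln_b = math.log(beta / (1.0 - alpha))
    log_ratio = [math.log(a / b) for a, b in zip(y1, y2)]
    drift = [eta_tau * (a - b) for a, b in zip(y1, y2)]
    r_total = 0.0                            # R_0 = 0
    for j in range(1, max_samples + 1):
        for cell, rate in enumerate(y_true):
            z = poisson_sample(eta_tau * rate, rng)
            # One cell's contribution to the increment of R_j.
            r_total += z * log_ratio[cell] - drift[cell]
        if r_total >= ln_a:
            return 1, j
        if r_total <= ln_b:
            return 2, j
    return 0, max_samples


rng = random.Random(1)
runs = [sprt_known_signals([20.0], [10.0], [20.0], 1.0, 0.01, 0.01, rng)
        for _ in range(200)]
print(sum(d == 1 for d, _ in runs) / len(runs),
      sum(n for _, n in runs) / len(runs))
```

The two printed numbers are the fraction of runs that chose $\omega_1$ and the observed average sample number for the assumed parameters.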
The SPRT procedure is:

choose $\omega_1$ if $R_j \ge \ln A$,
choose $\omega_2$ if $R_j \le \ln B$,
continue testing if $\ln B < R_j < \ln A$.

In order to find the ASN for this test we must solve the following equation:

$$E(R_k) = L(y) \ln B + [1 - L(y)] \ln A. \tag{246}$$

Using (244) we have

$$E(R_k) = k\eta\tau \sum_{\ell=1}^{m} \left[y_\ell \ln(y_{1\ell}/y_{2\ell}) - (y_{1\ell} - y_{2\ell})\right]. \tag{247}$$

Hence, for the ASN we have

$$k(y) = \frac{L(y) \ln B + [1 - L(y)] \ln A}{\eta\tau \sum_{\ell=1}^{m} \left[y_\ell \ln(y_{1\ell}/y_{2\ell}) - (y_{1\ell} - y_{2\ell})\right]} \tag{248}$$

for signal $y$ in general,
and

$$k(y) = \frac{L(y) \ln B + [1 - L(y)] \ln A}{\eta\tau \left[y \ln(y_1/y_2) - y_1 + y_2\right]}. \tag{253}$$

The quantity $L(y)$ is found from

$$L(y) = \frac{A^h - 1}{A^h - B^h} \tag{254}$$

where $h$ is chosen such that

$$\sum_{z=0}^{\infty} \left[\frac{p(z \mid \omega_1)}{p(z \mid \omega_2)}\right]^{h} p(z \mid y) = 1. \tag{255}$$

This equation can be solved to yield

$$y = \frac{(y_1 - y_2)\,h}{(y_1/y_2)^h - 1}. \tag{256}$$

By choosing values of $h$ we can plot $L(y)$ versus $y$. Using the corresponding values of $L(y)$ and $y$ we can determine and plot $k(y)$ versus $y$. For the special case of $h = 0$
$$L(y) = \frac{\ln A}{\ln(A/B)} \tag{257}$$

and

$$k(y) = \frac{-\ln A\,\ln B}{E\left\{\left[z \ln(y_1/y_2) - \eta\tau\,(y_1 - y_2)\right]^2\right\}} = \frac{-\ln A\,\ln B}{[\mathrm{var}(z) + E^2(z)] \ln^2(y_1/y_2) - 2E(z)\,\eta\tau\,(y_1 - y_2) \ln(y_1/y_2) + (\eta\tau)^2 (y_1 - y_2)^2}. \tag{258}$$
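The parametric construction of the OCF and ASN curves in the single cell case can be sketched as follows. For each value of the parameter $h$ the code evaluates the actual signal $y$, the OCF $L(y)$, and the ASN $k(y)$, with a separate branch for $h = 0$. The function name `ocf_asn_point` and the parameter values are hypothetical. In the $h = 0$ branch the code additionally assumes Poisson counts, so that $\mathrm{var}(z) = E(z) = \eta\tau\,y$ and the second moment of the increment reduces to $\eta\tau\,y \ln^2(y_1/y_2)$.

```python
import math


def ocf_asn_point(h, y1, y2, eta_tau, alpha, beta):
    """Return (y, L(y), k(y)) for one value of the parameter h."""
    ln_a = math.log((1.0 - beta) / alpha)
    ln_b = math.log(beta / (1.0 - alpha))
    log_ratio = math.log(y1 / y2)
    if abs(h) < 1e-9:
        # Special case h = 0: the mean increment E(r) vanishes.
        y = (y1 - y2) / log_ratio
        ocf = ln_a / (ln_a - ln_b)
        asn = -ln_a * ln_b / (eta_tau * y * log_ratio ** 2)
        return y, ocf, asn
    y = (y1 - y2) * h / ((y1 / y2) ** h - 1.0)
    a_h = math.exp(h * ln_a)
    b_h = math.exp(h * ln_b)
    ocf = (a_h - 1.0) / (a_h - b_h)
    mean_increment = eta_tau * (y * log_ratio - (y1 - y2))   # E(r)
    asn = (ocf * ln_b + (1.0 - ocf) * ln_a) / mean_increment
    return y, ocf, asn


for h in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(h, ocf_asn_point(h, 20.0, 10.0, 1.0, 0.01, 0.01))
```

The values $h = -1$ and $h = 1$ reproduce the design points $y = y_1$ with $L = \beta$ and $y = y_2$ with $L = 1 - \alpha$, which gives a quick consistency check on the curve.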


An example is shown in Figure 30 of the calculated ASN versus actual signal values $y$ for a sequential detector designed for two particular signal mean rates whose preassigned values of $\alpha$ and $\beta$ are equal. Figure 31 shows the OCF for the same example and sequential detector (the OCF is defined as the probability of choosing state $\omega_2$ to be present when the actual signal is $y$). Figures 32 and 33 are simulated results corresponding to Figures 30 and 31 respectively. As the ASN becomes larger, the calculated and simulated results compare more favorably. This is due to the approximation of the thresholds $A$ and $B$, which becomes more accurate as the ASN becomes large. Figures 34 and 35 show calculated ASN and OCF versus actual signal values $y$ for a sequential detector designed for two particular signal mean rates whose preassigned values of $\alpha$ and $\beta$ are not equal. For this example the detector is inclined to make decisions in favor of signal $y_2$ being present, since for $\beta > \alpha$ the detector guards less against false dismissal (saying state $\omega_2$ is present when $\omega_1$ is the true
Figure 31. Calculated OCF versus actual signal values $y$ for a sequential detector designed for two known signals, various values of $\alpha = \beta$, $y_1 = \ldots$

Figure 32. Computer-simulated ASN versus actual signal values $y$ for a sequential detector designed for two known signals, various values of $\alpha = \beta$, $y_1 = 20$, and $y_2 = 10$.
state of nature) than it does against false alarm (saying state $\omega_1$ is present when $\omega_2$ is the true state of nature). One hundred trials were made for each of the simulated results of Figures 32 and 33.

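The case just discussed may, as a special case, be considered as having two known signals $s_1$ and $s_2$ imbedded in known noise $n$. The two states of nature for this case are:

$\omega_1$: $y = s_1 + n$ (known signal plus noise),
$\omega_2$: $y = s_2 + n$ (known signal plus noise),

where $s_1 > s_2$. The value of $s_2$ may or may not be zero.

Using these conditions for the single cell case we have:

$$R_j = \sum_{i=1}^{j} z^i \ln\left(\frac{s_1 + n}{s_2 + n}\right) - \eta\tau\, j\,(s_1 - s_2), \tag{259}$$

$$E(R_k) = k\eta\tau \left[(s + n) \ln\left(\frac{s_1 + n}{s_2 + n}\right) - (s_1 - s_2)\right], \tag{260}$$

$$k = \frac{L(s) \ln B + [1 - L(s)] \ln A}{\eta\tau \left[(s + n) \ln\left(\dfrac{s_1 + n}{s_2 + n}\right) - (s_1 - s_2)\right]}, \tag{261}$$

$$L(s) = \frac{A^h - 1}{A^h - B^h}, \tag{262}$$

and

$$s = \frac{(s_1 - s_2)\,h}{\left(\dfrac{s_1 + n}{s_2 + n}\right)^{h} - 1} - n. \tag{263}$$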
Two Unknown Signals

Assume that we are to discriminate between two signals $y_1$ and $y_2$, both of which contain any background noise that may be present. These signals are fixed but unknown, and each of their elements is taken from statistically independent gamma distributions with known parameters:

$$f(y_q) = \prod_{\ell=1}^{m} f(y_{q\ell}) = \prod_{\ell=1}^{m} \frac{\alpha_{q\ell}\,(\alpha_{q\ell}\, y_{q\ell})^{u_{q\ell} - 1}\, e^{-\alpha_{q\ell}\, y_{q\ell}}}{\Gamma(u_{q\ell})} \qquad (q = 1, 2),\ y_{q\ell} > 0 \tag{264}$$

$$f(y_q) = 0 \quad \text{otherwise}.$$

Our two states of nature are:

$\omega_1$: $y = y_1$ (unknown),
$\omega_2$: $y = y_2$ (unknown),

where $\bar{y}_1 > \bar{y}_2$. We want to apply the SPRT procedure to decide which of these two states is present.

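The conditional probability distribution for the observed random variable $z$, given $\omega_q$ and $y_q$, is Poisson as in (242). Averaging over the gamma distributions (264) and taking the logarithm of the resulting likelihood ratio gives

$$R_j = \ln \ell(j) = \sum_{\ell=1}^{m} \ln \left[\frac{\alpha_{1\ell}^{u_{1\ell}}\; \Gamma(u_{2\ell})\; \Gamma\!\left(\sum_{i=1}^{j} z_\ell^i + u_{1\ell}\right) (\alpha_{2\ell} + j\eta\tau)^{\sum_{i=1}^{j} z_\ell^i + u_{2\ell}}}{\alpha_{2\ell}^{u_{2\ell}}\; \Gamma(u_{1\ell})\; \Gamma\!\left(\sum_{i=1}^{j} z_\ell^i + u_{2\ell}\right) (\alpha_{1\ell} + j\eta\tau)^{\sum_{i=1}^{j} z_\ell^i + u_{1\ell}}}\right]. \tag{267}$$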


The SPRT procedure is then:

choose $\omega_1$ if $R_j \ge \ln A$,
choose $\omega_2$ if $R_j \le \ln B$,
continue testing if $\ln B < R_j < \ln A$.


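Consider the single cell case (i.e., $m = 1$) where $y_1$, $y_2$, $y$, $z$, and $k$ are all scalars. Assume that $u_1 = u_2 = u$. For these conditions we have

$$R_j = \left(\sum_{i=1}^{j} z^i + u\right) \ln\left(\frac{\alpha_2 + j\eta\tau}{\alpha_1 + j\eta\tau}\right) + u \ln\left(\frac{\alpha_1}{\alpha_2}\right). \tag{268}$$

Also,

$$R_j = R_{j-1} + \left(\sum_{i=1}^{j-1} z^i + u\right) \ln\left[\left(\frac{\alpha_2 + j\eta\tau}{\alpha_1 + j\eta\tau}\right) \left(\frac{\alpha_1 + j\eta\tau - \eta\tau}{\alpha_2 + j\eta\tau - \eta\tau}\right)\right] + z^j \ln\left(\frac{\alpha_2 + j\eta\tau}{\alpha_1 + j\eta\tau}\right). \tag{269}$$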
We can now apply the SPRT procedure to discriminate between the two states of nature, $\omega_1$ and $\omega_2$.

We now attempt to find the ASN from

$$E(R_k) = L(y) \ln B + [1 - L(y)] \ln A. \tag{270}$$

If we assume that $E(R_k)$ may be approximated by its value at the average sample number, we can write

$$E(R_k) \approx \ln\left(\frac{\alpha_2 + k\eta\tau}{\alpha_1 + k\eta\tau}\right) [k\eta\tau\, E(y) + u] - u \ln(\alpha_2/\alpha_1). \tag{271}$$

Thus we have

$$[k\eta\tau\, E(y) + u] \ln\left(\frac{\alpha_2 + k\eta\tau}{\alpha_1 + k\eta\tau}\right) = L(y) \ln B + [1 - L(y)] \ln A + u \ln(\alpha_2/\alpha_1) \tag{272}$$

for general signal values $y$,

$$[k\eta\tau\, u/\alpha_1 + u] \ln\left(\frac{\alpha_2 + k\eta\tau}{\alpha_1 + k\eta\tau}\right) = \beta \ln B + (1 - \beta) \ln A + u \ln(\alpha_2/\alpha_1) \tag{273}$$

for $y_1$ present, and

$$[k\eta\tau\, u/\alpha_2 + u] \ln\left(\frac{\alpha_2 + k\eta\tau}{\alpha_1 + k\eta\tau}\right) = \alpha \ln A + (1 - \alpha) \ln B + u \ln(\alpha_2/\alpha_1) \tag{274}$$

for $y_2$ present.

These last two equations must be solved by a "cut and try" procedure (or graphically) to obtain $k$ (ASN). For general signal values $y$, $L(y)$ is very difficult, if not impossible, to obtain analytically.

Figures 36, 37, and 38 show graphical solutions of the ASN for the case of $y_1$ present for both the two-known-signals case and the two-unknown-signals case. It can be seen from these figures that for the two-unknown-signals case and small enough values of $\alpha$ and $\beta$, the mean cumulative information $\bar{R}_j = E(R_j \mid y, j)$ crosses neither threshold regardless of how large the sample number becomes.

For general signal values $y$ the ASN and $L(y)$ can be found by computer simulation. Figure 39 shows the simulated ASN versus actual signal values $y$ for the two-unknown-signals case that we have just considered. Simulated and calculated ASN for the sequential detector designed for two particular signal mean rates are also shown for comparison. Figure 40 shows the OCF for these cases. One thousand trials were made for each of the simulated results of Figures 39 and 40.

Calculated mean cumulative information versus sample number for sequential detectors designed for the two-known-signals case and the two-unknown-signals case for $u = 1000$, $y_1 = 20$, $y_2 = 10$, and $y = 20$.

Legend: • two unknown signals (simulated); ○ two known signals (calculated); × two known signals (simulated).

Figure 39. Computer-simulated ASN versus actual signal values $y$ for a sequential detector designed for two unknown signals and compared with calculated and simulated ASN of the two-known-signals case for $y_1 = 20$, $y_2 = 10$, $\alpha = \beta = 10^{\ldots}$, and $u = 1000$.
One Unknown Signal and One Known Signal

Assume that at the input of the detector we have either the known signal $y_2$ or the unknown signal $y_1$. We have as the two states of nature:

$\omega_1$: $y = y_1$ (unknown),
$\omega_2$: $y = y_2$ (known),

where $\bar{y}_1 > y_2$. We again consider this case because of its advantages in analyzing the detection error, from which we can gain additional insight into the detection process. The unknown signal $y_1$ is assumed fixed but unknown, with each of its elements taken from statistically independent gamma distributions with known parameters:

$$f(y_1) = \prod_{\ell=1}^{m} f(y_{1\ell}) = \prod_{\ell=1}^{m} \frac{\alpha_{1\ell}\,(\alpha_{1\ell}\, y_{1\ell})^{u_{1\ell} - 1}\, e^{-\alpha_{1\ell}\, y_{1\ell}}}{\Gamma(u_{1\ell})}, \qquad y_{1\ell} > 0 \tag{275}$$

$$f(y_1) = 0 \quad \text{otherwise}.$$

The conditional probability distribution for the observed random variable $z$, given $\omega_1$ and $y_1$, is

$$p_j(z \mid \omega_1, y_1) = \prod_{\ell=1}^{m} \prod_{i=1}^{j} \frac{(\eta\tau\, y_{1\ell})^{z_\ell^i}\, e^{-\eta\tau\, y_{1\ell}}}{z_\ell^i!}. \tag{276}$$
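We can now use this expression in the SPRT procedure to discriminate between the states of nature $\omega_1$ and $\omega_2$.

Now consider the single cell case where $y_1$, $y_2$, $y$, $z$, and $k$ are scalars. For the single cell case we have

$$R_j = u_1 \ln \alpha_1 + \ln \Gamma\!\left(\sum_{i=1}^{j} z^i + u_1\right) - \ln \Gamma(u_1) - \left(\sum_{i=1}^{j} z^i + u_1\right) \ln(\alpha_1 + j\eta\tau) - \sum_{i=1}^{j} z^i \ln y_2 + j\eta\tau\, y_2. \tag{280}$$

Using the approximation

$$\ln \Gamma(x + 1) \approx \tfrac{1}{2} \ln 2\pi - x + \left(x + \tfrac{1}{2}\right) \ln x \tag{281}$$

we can write

$$R_j \approx \left(\sum_{i=1}^{j} z^i + u_1 - \tfrac{1}{2}\right) \ln\left(\sum_{i=1}^{j} z^i + u_1 - 1\right) - \left(u_1 - \tfrac{1}{2}\right) \ln(u_1 - 1) - \sum_{i=1}^{j} z^i + u_1 \ln \alpha_1 - \left(\sum_{i=1}^{j} z^i + u_1\right) \ln(\alpha_1 + j\eta\tau) - \sum_{i=1}^{j} z^i \ln y_2 + j\eta\tau\, y_2 \tag{282}$$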
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and 


(283) 


To solve for the ASN and OCF analytically appears extremely difficult in this case. We will consider only computer-simulated solutions. Figures 41 and 42 show, respectively, the simulated ASN and OCF versus actual signal values $y$ for a detector designed for one signal with known mean rate and one signal with unknown mean rate. Figure 43 shows, for comparison, the ASN for the two-known-signals case, the two-unknown-signals case, and the case just considered. It can be seen from this figure how the ASN increases as less is known about the signals.

Two hundred trials were made for each of the simulated results of Figures 41 and 42.
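For simulations of this case the single cell test statistic has to be evaluated at every sample. The sketch below shows one way this could be done. It assumes the statistic obtained by averaging the Poisson likelihood over the gamma distribution of $y_1$ and dividing by the Poisson likelihood for the known $y_2$. The function names are hypothetical, and the log-gamma function `math.lgamma` is used for the exact version so that large counts do not overflow. A second function uses the Stirling-type approximation for comparison and requires $u_1 > 1$.

```python
import math


def log_ratio_exact(total_counts, j, u1, alpha1, y2, eta_tau):
    """Log likelihood ratio R_j after j samples with sum(z) = total_counts."""
    s = total_counts
    return (u1 * math.log(alpha1)
            + math.lgamma(s + u1) - math.lgamma(u1)
            - (s + u1) * math.log(alpha1 + j * eta_tau)
            - s * math.log(y2) + j * eta_tau * y2)


def log_ratio_stirling(total_counts, j, u1, alpha1, y2, eta_tau):
    """Same statistic with ln Gamma(x+1) ~ 0.5 ln(2 pi) - x + (x + 0.5) ln x."""
    s = total_counts
    return ((s + u1 - 0.5) * math.log(s + u1 - 1.0)
            - (u1 - 0.5) * math.log(u1 - 1.0)
            - s
            + u1 * math.log(alpha1)
            - (s + u1) * math.log(alpha1 + j * eta_tau)
            - s * math.log(y2) + j * eta_tau * y2)


# Illustrative values: prior mean u1/alpha1 = 20 and known y2 = 10.
exact = log_ratio_exact(200, 10, 10.0, 0.5, 10.0, 1.0)
approx = log_ratio_stirling(200, 10, 10.0, 0.5, 10.0, 1.0)
print(exact, approx, approx - exact)
```

The two versions differ only through the neglected correction terms of the approximation, which are small when both $u_1 - 1$ and the accumulated count are large.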


Figure 42. Computer-simulated OCF versus actual signal values $y$ for a sequential detector designed for one known signal $y_2$ and one unknown signal $y_1$, where $y_1 = 20$ and $y_2 = 10$.

Figure 43. Comparison of simulated ASN versus actual signal $y$ for the sequential detectors designed for the two-known-signals case, the two-unknown-signals case, and the case of one known signal and one unknown signal, where $y_1 = 20$, $y_2 = 10$, $u = 1000$, and $\alpha = \beta = 10^{\ldots}$.
Unknown Signal Imbedded in Known Noise

Consider the case where we have one unknown signal $s$ which, if present at the input of the detector, is imbedded in known noise $n$. The two states of nature are:

$\omega_1$: $y = s + n$ (unknown signal plus noise),
$\omega_2$: $y = n$ (noise alone).

The signal $s$ is assumed fixed but unknown, with each of its elements taken from statistically independent gamma distributions with known parameters:

$$f(s) = \prod_{\ell=1}^{m} f(s_\ell) = \prod_{\ell=1}^{m} \frac{\beta_\ell\,(\beta_\ell\, s_\ell)^{u_\ell - 1}\, e^{-\beta_\ell\, s_\ell}}{\Gamma(u_\ell)}, \qquad s_\ell > 0 \tag{284}$$

$$f(s) = 0 \quad \text{otherwise}.$$

The conditional likelihood ratio is

$$\ell(j \mid s) = \frac{p_j(z \mid \omega_1, s)}{p_j(z \mid \omega_2)}. \tag{285}$$
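The likelihood ratio becomes

$$\ell(j) = \int_0^{\infty} \ell(j \mid s)\, f(s)\, ds = \prod_{\ell=1}^{m} \left(\frac{\beta_\ell}{\beta_\ell + j\eta\tau}\right)^{u_\ell} \frac{\left(\sum_{i=1}^{j} z_\ell^i\right)!}{\Gamma(u_\ell)\, [n_\ell(\beta_\ell + j\eta\tau)]^{\sum_{i=1}^{j} z_\ell^i}} \sum_{k=0}^{\sum_i z_\ell^i} \frac{[n_\ell(\beta_\ell + j\eta\tau)]^k\; \Gamma\!\left(\sum_{i=1}^{j} z_\ell^i + u_\ell - k\right)}{\left(\sum_{i=1}^{j} z_\ell^i - k\right)!\; k!}. \tag{286}$$

Taking the logarithm of this equation and assuming the single cell case we have

$$R_j = u \ln\left(\frac{\beta}{\beta + j\eta\tau}\right) - \ln \Gamma(u) + \ln\left[\left(\sum_{i=1}^{j} z^i\right)!\right] - \left(\sum_{i=1}^{j} z^i\right) \ln[n(\beta + j\eta\tau)] + \ln\left\{\sum_{k=0}^{\sum_i z^i} \frac{[n(\beta + j\eta\tau)]^k\; \Gamma\!\left(\sum_{i=1}^{j} z^i + u - k\right)}{\left(\sum_{i=1}^{j} z^i - k\right)!\; k!}\right\}. \tag{287}$$

This expression can then be used in the SPRT procedure to discriminate between the states of nature, $\omega_1$ and $\omega_2$.

An exact analytical solution of the ASN and OCF for this case appears extremely difficult, if not impossible. It appears that the only reasonable method of analysis of this case is by computer simulation.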
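Because the single cell statistic for this case contains factorials and gamma functions of the accumulated count, a direct evaluation overflows quickly, in line with the overflow problems mentioned earlier for the fixed-sample calculations. The following sketch evaluates the statistic in the log domain with `math.lgamma` and a log-sum-exp over the finite sum. It is an illustration only: the function name and argument names are hypothetical, `beta_rate` denotes the rate parameter $\beta$ of the gamma distribution of $s$ (not the false-dismissal probability), and integer total counts are assumed.

```python
import math


def log_ratio_unknown_in_noise(total_counts, j, u, beta_rate, n, eta_tau):
    """R_j for an unknown signal in known noise, single cell case.

    total_counts is the integer sum of the counts z^1 + ... + z^j.
    """
    s = total_counts
    log_c = math.log(n * (beta_rate + j * eta_tau))
    # Logarithm of each term of the finite sum over k = 0, ..., s.
    terms = [k * log_c + math.lgamma(s + u - k)
             - math.lgamma(s - k + 1) - math.lgamma(k + 1)
             for k in range(s + 1)]
    largest = max(terms)
    log_sum = largest + math.log(sum(math.exp(t - largest) for t in terms))
    return (u * math.log(beta_rate / (beta_rate + j * eta_tau))
            - math.lgamma(u)
            + math.lgamma(s + 1)          # ln[(sum of z)!]
            - s * log_c
            + log_sum)


# Illustrative values: prior mean u/beta_rate = 10 and noise rate n = 10.
print(log_ratio_unknown_in_noise(150, 10, 10.0, 1.0, 10.0, 1.0))
print(log_ratio_unknown_in_noise(250, 10, 10.0, 1.0, 10.0, 1.0))
```

With no counts observed the statistic reduces to $u \ln[\beta/(\beta + j\eta\tau)]$, which provides a simple check of the implementation.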
Figure 44 shows the simulated ASN versus actual signal values $y$ for a detector designed for a signal with unknown mean rate imbedded in Poisson noise with known mean rate. Figure 45 shows the corresponding OCF. Two hundred trials were made for each of the simulated results of Figures 44 and 45.

Information  Content  of  Samples 

An important property of the sequential detector is the information content of the samples. We will consider the mean information of a sample to be

$$\bar{I}_j = E(I_j \mid y, j) = E(R_j \mid y, j) - E(R_{j-1} \mid y, j-1). \tag{288}$$

The information is favorable to the hypothesis that $\omega_1$ is the true state of nature when the above expression is positive. When the information is negative, the sequential detector tends to choose $\omega_2$ as the state of nature. We will determine the amount of information provided by the samples of the various sequential detectors that we have previously considered.

For the detector designed for two known signals, the mean information per sample is the same for every sample, whatever actual signal $y$ may be present, and is given by


Figure 44. Computer-simulated ASN versus actual signal values $y$ for a sequential detector designed for an unknown signal $s$ which if present is imbedded in known noise $n$, where $\bar{s} = 10$, $n = y_2 = 10$, $\bar{y}_1 = 20$, and $u = 10$.

Figure 45. Computer-simulated OCF versus actual signal values $y$ for a sequential detector designed for an unknown signal $s$ which if present is imbedded in known noise $n$, where $\bar{s} = 10$, $n = y_2 = 10$, $\bar{y}_1 = 20$, and $u = 10$.
$$\bar{I} = \eta\tau \left[y \ln(y_1/y_2) - y_1 + y_2\right]. \tag{289}$$

The expected cumulative information for the $j$th sample is

$$\bar{R}_j = E(R_j \mid y, j) = \eta\tau\, j \left[y \ln(y_1/y_2) - y_1 + y_2\right] \tag{290}$$

where the true mean rate is $y$. Figure 46 shows the mean cumulative information versus sample number for this case where the detector is designed for $y_2 = 10$ and for various values of $y_1$. The positive values result when $\omega_1$ is the state of nature ($y = y_1$), and the negative values result when $\omega_2$ is the state of nature ($y = y_2$).

Whenever either or both of the two signals have unknown mean rates, the information per sample is no longer constant. For the detector designed for one unknown signal $y_1$ and one known signal $y_2$, the mean cumulative information for the $j$th sample is


(291) 
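Using the approximations that

$$E(R_j(z) \mid y, j) \approx R_j(\bar{z}) \tag{292}$$

and

$$\ln \Gamma(x + 1) \approx \tfrac{1}{2} \ln 2\pi - x + \left(x + \tfrac{1}{2}\right) \ln x \tag{293}$$

we can write

$$\bar{R}_j \approx \left(j\eta\tau y + u_1 - \tfrac{1}{2}\right) \ln(j\eta\tau y + u_1 - 1) - \left(u_1 - \tfrac{1}{2}\right) \ln(u_1 - 1) + u_1 \ln\left(\frac{\alpha_1}{\alpha_1 + j\eta\tau}\right) + j\eta\tau\, y_2 - j\eta\tau\, y - j\eta\tau\, y \ln[y_2(\alpha_1 + j\eta\tau)] \tag{294}$$

and

$$\bar{I}_j \approx \left(j\eta\tau y - \eta\tau y + u_1 - \tfrac{1}{2}\right) \ln\left(\frac{j\eta\tau y + u_1 - 1}{j\eta\tau y - \eta\tau y + u_1 - 1}\right) + \eta\tau y \ln(j\eta\tau y + u_1 - 1) - (j\eta\tau y - \eta\tau y) \ln\left(\frac{\alpha_1 + j\eta\tau}{\alpha_1 + j\eta\tau - \eta\tau}\right) - \eta\tau y \ln[y_2(\alpha_1 + j\eta\tau)] - u_1 \ln\left(\frac{\alpha_1 + j\eta\tau}{\alpha_1 + j\eta\tau - \eta\tau}\right) + \eta\tau\, y_2 - \eta\tau\, y. \tag{295}$$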
The limit of $\bar{I}_j$ as $j$ approaches infinity gives us an indication of what the detector is doing as the sample number becomes large. The mean information per sample provides us with a measure of the adaptive capability and the performance of the detector. It provides some insight into the speed with which the detector will make a decision. The limit of $\bar{R}_j$ as $j$ approaches infinity gives us some upper bound to the information that we can attain and may point out some limitations of the detector.

For the case of the detector designed for one unknown signal $y_1$ and one known signal $y_2$, the limit of the mean cumulative information as $j$ approaches infinity is

$$\lim_{j \to \infty} \bar{R}_j \approx u_1 \ln(\alpha_1 y) - \alpha_1 y - \left(u_1 - \tfrac{1}{2}\right) \ln(u_1 - 1) + (u_1 - 1) + \lim_{j \to \infty} \left\{ j\eta\tau \left[y \ln(y/y_2) - y + y_2\right] - \tfrac{1}{2} \ln(j\eta\tau y + u_1 - 1) \right\}. \tag{296}$$

The limit of the mean information per sample as $j$ approaches infinity is

$$\lim_{j \to \infty} \bar{I}_j = \eta\tau \left[y \ln(y/y_2) - y + y_2\right]. \tag{297}$$

For $\omega_1$ as the state of nature, the detector initially does not know the true mean rate of the signal, and the mean information per sample provided about the state $\omega_1$ is small. As more samples are taken, the detector adapts to $y_1$ and thus becomes more effective. As the sample number becomes large, the detector learns $y_1$, and the mean information per sample then becomes the same as for the case when the signal was known beforehand. If $\omega_2$ is the state of nature, the detector is most effective to begin with, and as more samples are taken the detector performance begins to deteriorate, since the mean information per sample decreases and in the limit approaches zero. The magnitude of the mean cumulative information approaches infinity as $j$ approaches infinity when either $\omega_1$ or $\omega_2$ is the state of nature, but it does so faster for state $\omega_1$ than for state $\omega_2$. The mean cumulative information versus sample number for this case is shown in Figure 47. The detector for this case is designed for a known signal with a mean rate of $y_2 = 10$ and for an unknown signal with various expected mean rates $\bar{y}_1$. The positive values result when $\omega_1$ is the state of nature ($y = \bar{y}_1$) and negative values result when $\omega_2$ is the state of nature ($y = y_2$).

For the case of one unknown signal $s$ imbedded in known Poisson noise $n$, a similar result holds. For this case, when $\omega_1$ is the true state of nature, the detector approaches the two-known-signals case faster than does the detector of the case just discussed, where one signal is known and one signal is unknown. Also, when $\omega_2$ is the true state of nature, the detector does not deteriorate as rapidly. The mean cumulative information versus sample number for this case is shown in Figure 48. The detector is designed for known noise $n = y_2 = 10$, and several values


Figure 47. Calculated mean cumulative information versus sample number for a sequential detector designed for one known signal and several mean values of the unknown signal $y_1$, where $y_2 = 10$, $y = \bar{y}_1$, and $u = 10$.

Figure 48. Calculated mean cumulative information versus sample number for a sequential detector designed for the unknown signal $s$ which if present is imbedded in known noise $n$, where $y_2 = n = 10$, $y = \bar{y}_1 = \bar{s} + n$, and $u = 10$.
of unknown signal $\bar{s}$ or $\bar{y}_1 = \bar{s} + n$. Positive values result when $\omega_1$ is the state of nature ($y = \bar{y}_1$), and negative values result when $\omega_2$ is the state of nature ($y = y_2$).

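When we have the two-unknown-signals case, the mean information per sample for the $j$th sample is

$$\bar{I}_j = \left[(j - 1)\eta\tau y + u\right] \ln\left[\left(\frac{\alpha_2 + j\eta\tau}{\alpha_1 + j\eta\tau}\right) \left(\frac{\alpha_1 + j\eta\tau - \eta\tau}{\alpha_2 + j\eta\tau - \eta\tau}\right)\right] + \eta\tau y \ln\left(\frac{\alpha_2 + j\eta\tau}{\alpha_1 + j\eta\tau}\right) \tag{298}$$

and the mean cumulative information for the $j$th sample is

$$\bar{R}_j = j\eta\tau y \ln\left(\frac{\alpha_2 + j\eta\tau}{\alpha_1 + j\eta\tau}\right) + u \ln\left[\frac{\alpha_1 (\alpha_2 + j\eta\tau)}{\alpha_2 (\alpha_1 + j\eta\tau)}\right]. \tag{299}$$

For this case the detector is most effective to begin with, regardless of the true state of nature, and as more samples are taken the detector performance begins to deteriorate because of the decrease in the mean information per sample, which in the limit approaches zero. The mean cumulative information approaches a constant that depends on the actual signal present and is given by

$$\lim_{j \to \infty} \bar{R}_j = (\alpha_2 - \alpha_1)\, y - u \ln(\alpha_2/\alpha_1). \tag{300}$$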
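The contrast between the known-signal detector, whose mean cumulative information grows linearly with the sample number, and the two-unknown-signals detector, whose mean cumulative information saturates, can be checked numerically. The sketch below is an illustration under the single cell assumption with $u_1 = u_2 = u$. The function names and the parameter values (prior means $u/\alpha_1 = 20$ and $u/\alpha_2 = 10$ with $u = 100$) are assumptions made for this example.

```python
import math


def cumulative_info_known(j, y, y1, y2, eta_tau):
    """Mean cumulative information after j samples, two known signals."""
    return j * eta_tau * (y * math.log(y1 / y2) - y1 + y2)


def cumulative_info_unknown(j, y, u, alpha1, alpha2, eta_tau):
    """Mean cumulative information after j samples, two unknown signals."""
    log_term = math.log((alpha2 + j * eta_tau) / (alpha1 + j * eta_tau))
    return (j * eta_tau * y + u) * log_term - u * math.log(alpha2 / alpha1)


def limiting_info_unknown(y, u, alpha1, alpha2):
    """Limit of the mean cumulative information as j grows without bound."""
    return y * (alpha2 - alpha1) - u * math.log(alpha2 / alpha1)


u, alpha1, alpha2 = 100.0, 5.0, 10.0
for j in (1, 10, 100, 1000):
    print(j,
          cumulative_info_known(j, 20.0, 20.0, 10.0, 1.0),
          cumulative_info_unknown(j, 20.0, u, alpha1, alpha2, 1.0))
print(limiting_info_unknown(20.0, u, alpha1, alpha2))
```

If $\ln A$ exceeds the limiting value, the mean cumulative information never reaches the upper threshold, which is the behavior noted for small error probabilities in the two-unknown-signals case.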
Figure 49 shows the mean cumulative information versus sample number for this case when the detector is designed for an unknown signal with an expected mean rate of $\bar{y}_2 = 10$ and for an unknown signal with various expected mean rates $\bar{y}_1$. The positive values result when $\omega_1$ is the state of nature ($y = \bar{y}_1$) and negative values result when $\omega_2$ is the state of nature ($y = \bar{y}_2$).

Figure 50 shows the mean information per sample versus sample number for the two-known-signals case and the two-unknown-signals case. The positive information is obtained when $\omega_1$ is the state of nature ($y = y_1$), and negative information is obtained when $\omega_2$ is the state of nature ($y = y_2$). Figure 51 shows the calculated mean cumulative information versus sample number and also a computer-simulated sample of the cumulative information for a detector designed for two known signals when $\omega_1$ is the state of nature ($y = y_1$). Also shown in Figure 51 is the computer-simulated mean of the cumulative information versus sample number. This simulated sample mean compares very favorably with the calculated values. Figure 52 shows the results for the same detector when $\omega_2$ is the state of nature ($y = y_2$). Figures 53 and 54 show results that correspond to Figures 51 and 52 respectively, for a detector designed for two unknown signals. Figure 55 shows the computer-simulated mean cumulative information versus sample number for a detector designed for two known signals, $y_1 = 20$ and $y_2 = 10$, and for various values of actual signals present. Figure 56 shows the computer-simulated mean


Figure  50*  Calculated  mean  information  per  sample  versus  sample 
number  for  a  sequential  detector  designed  for  the  two- 
known -signals  case  ^and  the  two -unknown -signals  case 
where  y  =  20,  and  y  a  IQ. 


4  calculated  mean 
e  simulated  sample 
x  s:mulated  mean 


Figure  51.  Calculated  and  simulated  mean  cumulative  information 
versus  sample  number  and  a  simulated  sample  of  the 
cumulative  information  versus  sample  number  for  a 
sequential  detector  designed  for  two  known  signals 
when  is  the  state  of  nature  and  =  20.  y  =10. 
and  y  =  20. 


Figure  52.  Calculated  and  simulated  mean  cumulative  information 
versus  sample  number  and  a  simulated  sample  of  the 
cumulative  information  versus  sample  number  for  a 
sequential  detector  designed  for  two  known  signals 
when  u>_  is  the  state  of  nature  and  y  =  20*  y.  =  10. 
and  y  =10.  1  2 


a  calculated  mean 
•  simulated  sample 
n  simulated  mean 


Figure  53-  Calculated  and  simulated  mean  cumulative  information 
versus  sample  number  and  a  simulated  sample  of  the 
cumulative  information  versus  sample  number  for  a 
sequential  detector  designed  for  two  unknown  signals 
when  is  the  state  of  nature  and  =  20.  »  10. 

y  =  20.  and  u  =  100. 
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Figure  55.  Computer -simulated  mean  cumulative  information  versus 
sample  number  for  a  sequential  detector  designed  for 


two  known  signals,  =  20  and  =  10.  and  for  various 
values  y  of  actual  signal  present. 
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Figure  56.  Computer-simulated  mean  cumulative  information  versus 
sample  number  for  a  sequential  detector  designed  for  two 
unknown  signals  and  for_various  values  y  of  actual  signal 
present  where  y  j  =  20,  y^  -  10,  and  u  =  100. 
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cumulative  information  versus  sample  number  for  a  detector  designed 

for  two  unknown  signals,  with  mean  rates  =  20  and  y^,  =  10,  and  for 

various  values  of  actual  signals  present.  Figures  36,  37,  and  38  show 

the  calculated  mean  cumulative  information  versus  sample  number  for 

different  values  of  u.  These  figures  can  be  used  to  find  graphical 

solutions  of  ASN  for  the  case  of  y^  present  (  as  the  state  of  nature). 

Two  hundred  trials  were  made  for  each  of  the  simulated  results  of 

Figures  51,  52,  53,  54,  55,  and  56. 

To  simulate  the  mean  cumulative  information,  it  was  necessary  to 

find  the  sample  means  I.,  I*,  .  „ . ,  I.  (i.  e. ,  averages  of  200  samples). 

12  j 

These  sample  averages  were  then  added  to  yield  the  simulated  mean 
cumulative  information. 
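The averaging procedure just described can be sketched in a few lines. The sketch is illustrative only. It assumes that the information contributed by one sample is the Poisson log-likelihood ratio $n \ln(y_1/y_2) - (y_1 - y_2)$ for an observed count $n$, which is a natural choice for two known signals but is not quoted from the text, and all function names are hypothetical.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson count (Knuth's method; adequate for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def information(count, y1, y2):
    """Assumed per-sample information: Poisson log-likelihood ratio of y1 against y2."""
    return count * math.log(y1 / y2) - (y1 - y2)


def simulated_mean_cumulative(y, y1, y2, num_samples, num_trials=200, seed=0):
    """Add the per-sample averages (over num_trials) to get the simulated mean."""
    rng = random.Random(seed)
    cumulative, running = [], 0.0
    for _ in range(num_samples):
        sample_mean = sum(
            information(poisson_sample(y, rng), y1, y2) for _ in range(num_trials)
        ) / num_trials
        running += sample_mean
        cumulative.append(running)
    return cumulative


def calculated_mean_cumulative(y, y1, y2, num_samples):
    """Exact mean of the same quantity: j times the mean information per sample."""
    per_sample = y * math.log(y1 / y2) - (y1 - y2)
    return [per_sample * j for j in range(1, num_samples + 1)]
```

Under this assumption the calculated curve is a straight line in the sample number, positive when the actual rate is $y_1$ and negative when it is $y_2$, and the simulated curve scatters around it.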


Savings  of  ASN 

One  of  the  advantages  of  sequential  detection  over  fixed-sample 
detection  is  in  the  savings  of  the  average  number  of  samples  needed  to 
achieve  a  specified  reliability  of  decision.  We  are  interested  in  making 
some  comparison  of  these  two  detectors  to  obtain  an  idea  of  the  difference 
of  ASN. 

Figure 57 shows the calculated ASN of the two-known-signals case versus error probability for the fixed-sample detector, where $p(\omega_1) = p(\omega_2) = 1/2$, and the sequential detector for both $\omega_1$ and $\omega_2$ as the states of nature. This figure shows that for the particular example considered, the sequential detector has about a 40% savings of ASN. Figure 58 shows the computer-simulated results corresponding to Figure 57. The simulated results show that for small sample numbers the fixed-sample detector actually does better than the sequential detector, but as the sample number becomes large the sequential detector is substantially better than the fixed-sample detector. Figure 59 shows the computer-simulated results corresponding to Figure 58 except that the sequential detector is designed for two unknown signals. Four hundred trials were made for each of the simulated results of Figures 58 and 59.

Figure 57. Calculated ASN for the two-known-signals case versus error probability for the fixed-sample detector ($p(\omega_1) = p(\omega_2) = 1/2$) and the sequential detector for both states of nature $\omega_1$ and $\omega_2$, where $y_1 = 12$ and $y_2 = 10$. Legend: fixed-sample detector; sequential detector (one curve for each state of nature).


SUMMARY AND CONCLUSIONS


In this paper the problems of estimating and discriminating between
extended optical signals which have been distorted by diffraction, additive
background noise, and multiplicative noise have been studied. Poisson
statistics have been assumed throughout this paper since in many real
situations they are more realistic than Gaussian statistics, which are more
commonly used.

For the estimation problem, an optimum linear estimate using the
minimum mean-square-error criterion was considered. Detection
noise (multiplicative noise) as well as additive noise was considered
since for any measurement technique there will be some interaction
between photons and matter which in turn will give rise to detection noise.
The performance of the minimum mean-square-error estimation procedure
was evaluated for several special cases. Some results pertinent to
optimum sampling schemes were obtained for both white and colored
noise.

Other  estimates  besides  the  minimum  mean-square-error  estimate 
were  considered.  These  other  estimates  included  the  Bayes'  estimate, 
the  maximum  likelihood  estimate,  and  the  maximum  a  posteriori  estimate. 
These  estimates  were  considered  primarily  to  find  estimates  of  the  mean 
rate  of  signal-plus-noise  photons  y  incident  upon  the  image  plane.  These 
estimates were then used in an estimator-correlator in conjunction with
fixed-sample detection. Given that the mean rate is taken from a gamma
distribution with known parameters, the minimum mean-square-error
estimate and the Bayes' estimate of the mean rate were found to be the same.
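The agreement between the two estimates can be illustrated with the standard gamma-Poisson conjugate update. The sketch below rests on assumptions that are not taken from the text: the gamma prior is parameterized by a shape and a rate (the parameterization used in this paper may differ), the Bayes estimate is taken under squared-error loss, and the function names are hypothetical.

```python
def posterior_mean_rate(counts, shape, rate):
    """Posterior mean of a Poisson rate under a Gamma(shape, rate) prior.

    The posterior is Gamma(shape + sum(counts), rate + len(counts)); its mean is
    both the minimum mean-square-error estimate and the Bayes estimate under
    squared-error loss.
    """
    return (shape + sum(counts)) / (rate + len(counts))


def map_rate(counts, shape, rate):
    """Posterior mode (maximum a posteriori estimate); needs shape + sum(counts) >= 1."""
    return (shape + sum(counts) - 1.0) / (rate + len(counts))


def ml_rate(counts):
    """Maximum likelihood estimate: the sample mean of the counts."""
    return sum(counts) / len(counts)


counts = [8, 12, 10]
estimates = (
    posterior_mean_rate(counts, shape=2.0, rate=0.2),
    map_rate(counts, shape=2.0, rate=0.2),
    ml_rate(counts),
)
```

As more counts are observed, all three estimates approach the sample mean and the influence of the prior parameters fades.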

The Bayes' decision rule was used in the fixed-sample detection
procedure to discriminate between extended optical signals. Only
problems involving two possible states of nature, $\omega_1$ and $\omega_2$, were
considered. Various amounts of a priori information were assumed about
the two possible signals which may be present at the input of the detector.
In the error analysis only a single cell of the image plane was considered.
For cases of unknown signal parameters (mean rates), the parameters
were assumed fixed but initially unknown, having been taken from a known
gamma distribution. Several estimator-correlators were considered.

The Bayes' and minimum mean-square-error estimator-correlators
were found to be superior to the maximum a posteriori estimator-correlator
for the cases considered. It was found in general that if the
error probabilities are greater than about 0.1, knowing the signal parameters
(mean rates) is not much more of an advantage than knowing only
the probability distributions from which they were taken, particularly
if the Bayes' decision rule or the Bayes' estimator-correlator is
used. For error probabilities much smaller than 0.1 the above statement
does not hold and the difference in error probabilities becomes significant.

When speed is important in processing optical data, sequential detection should be considered. The sequential detection test developed by Wald (1947) minimizes the average test length and hence has a shorter average test length than the fixed-sample test for a given P(FA) and P(FD). A comparison was made of the test lengths of these two detectors to obtain some insight into the savings of time that the sequential detector has over the fixed-sample detector. Sequential detectors were derived for various cases of known and unknown optical signals and their performances compared. For the case of a sequential detector designed for two known signals, the information per sample remained constant for all samples. It was found that if only one of the signals for which the sequential detector was designed was unknown and if the state of nature corresponding to that signal was present, the information per sample is smallest for the first sample, and as more samples are taken the information per sample increases and approaches the constant information per sample of the two-known-signals case. The performance of the sequential detector for these cases thus improves as more samples are taken. It was also found that if the other state of nature was present, the information per sample is greatest for the first sample, and as more samples are taken the information per sample decreases; thus the performance of the sequential detector deteriorates as more samples are taken. For the case of the sequential detector designed for two unknown signals, the information per sample is greatest for the first sample and decreases as more samples are taken regardless of which state of nature is present. The average sample number and Operating Characteristic Function (probability of choosing a given state of nature when signal $y$ is present) versus actual signal $y$ were determined for the cases mentioned above. For most of the cases considered, analytical solutions appeared extremely difficult to obtain; hence, the analysis was to a large extent carried out by computer simulation.

For future work in this area, it is recommended that the discrete
sequential detector that has been discussed in this paper be considered
in terms of a continuous sequential detector where only one observation
is made and the duration time of the observation is varied. Also,
multiple detection should be considered for cases with more than
two states of nature.


APPENDIX

Two-Dimensional Optical Transform Theory

Imaging  Configuration 

We will use Huygens' principle in this development. According to
Huygens' principle, each point $(\alpha, \beta)$ acts as the source of a wavelet
that propagates with the Green's function $e^{ikr}/r$ for radiation from a
point source, where $r$ is the distance from the source and $k$ is equal to
$2\pi\nu/c$ ($c$ is the speed of light and $\nu$ is the frequency of the radiation).

The electric field at the aperture is then found by adding the vector
amplitudes of all the wavelets originating at the object plane. The strength
of each wavelet depends on the complex amplitude of the electric field
in the object plane.

We again employ Huygens' principle just to the right of the aperture.
Each point $(\sigma, \gamma)$ of the aperture acts as the source of a secondary
wavelet that again propagates with the Green's function $e^{ikr}/r$. The total
electric field at the image plane is found by adding the vector amplitudes
of all the wavelets originating at the aperture plane, where the strength
of these wavelets depends on the incident electric field at the points of
the aperture.

Let $E(\alpha, \beta)$ be the complex amplitude of the monochromatic source
point in the object at point $(\alpha, \beta)$. Using Huygens' principle, the electric
field just to the left of the aperture is


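$$\frac{E(\alpha,\beta)\,e^{ikR_1}}{R_1}. \tag{301}$$

We are assuming Fraunhofer diffraction; hence,

$$R_1 + R_{10} \approx 2R_{10} \approx 2R \tag{302}$$

and

$$R_2 + R_{20} \approx 2R_{20} \approx 2R, \tag{303}$$

$$R_1 \approx R. \tag{304}$$

We can rewrite (301) as

$$\frac{E(\alpha,\beta)\,e^{ikR_1}}{R}. \tag{305}$$

From the geometry,

$$R^2 = R_1^2 - (\sigma-\alpha)^2 - (\gamma-\beta)^2 = R_{10}^2 - \alpha^2 - \beta^2, \tag{306}$$

$$R^2 = R_2^2 - (\sigma+\xi)^2 - (\gamma+\zeta)^2 = R_{20}^2 - \xi^2 - \zeta^2, \tag{307}$$

$$R_1 - R_{10} = \frac{\rho^2}{R_1+R_{10}} - \frac{2}{R_1+R_{10}}\,(\sigma\alpha + \gamma\beta), \tag{308}$$

where $\rho^2 = \sigma^2 + \gamma^2$, and

$$R_2 - R_{20} = \frac{\rho^2}{R_2+R_{20}} + \frac{2}{R_2+R_{20}}\,(\sigma\xi + \gamma\zeta). \tag{309}$$

Then

$$R_1 - R_{10} \approx \frac{\rho^2}{2R} - \frac{\sigma\alpha + \gamma\beta}{R}, \tag{310}$$

$$R_2 - R_{20} \approx \frac{\rho^2}{2R} + \frac{\sigma\xi + \gamma\zeta}{R}. \tag{311}$$

For small aperture, $\rho^2/2R$ can be neglected. Hence,

$$R_1 - R_{10} \approx -\frac{\sigma\alpha + \gamma\beta}{R} \tag{312}$$

and

$$R_2 - R_{20} \approx \frac{\sigma\xi + \gamma\zeta}{R}. \tag{313}$$

Let $f_\sigma = \sigma/\lambda R$ and $f_\gamma = \gamma/\lambda R$, where $\lambda = c/\nu$ is the wavelength, so that $k = 2\pi/\lambda$. We now can rewrite (305) as

$$E(\alpha,\beta)\,e^{ikR_1} = E(\alpha,\beta)\,e^{ikR_{10}}\,e^{ik(R_1-R_{10})} = E(\alpha,\beta)\,e^{ikR_{10}}\,e^{-i2\pi(f_\sigma\alpha + f_\gamma\beta)}. \tag{314}$$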

Coherent  Light 

Since $R_{10}$ is approximately constant over the object pattern, we can neglect $e^{ikR_{10}}$ or absorb it into $E(\alpha,\beta)$ since it is only a constant phase term, and we cannot observe phase but only intensity.

The electric field at point $(\sigma,\gamma)$ just to the left of the aperture plane is the sum of all contributions at that point due to all the points in the object plane. This is true because amplitudes add for coherent light.

$$W(f_\sigma,f_\gamma) = \iint_{-\infty}^{\infty} E(\alpha,\beta)\,e^{-i2\pi(f_\sigma\alpha + f_\gamma\beta)}\,d\alpha\,d\beta. \tag{315}$$

Let $T(f_\sigma,f_\gamma)$ represent the aperture function. The electric field at point $(\sigma,\gamma)$ just to the right of the aperture plane is then

$$U(f_\sigma,f_\gamma) = W(f_\sigma,f_\gamma)\,T(f_\sigma,f_\gamma). \tag{316}$$

Again using Huygens' principle, the electric field at $(\xi,\zeta)$ in the image plane due to the point $(\sigma,\gamma)$ in the aperture plane is proportional to

$$U(f_\sigma,f_\gamma)\,e^{ikR_{20}}\,e^{ik[\rho^2/2R + (\sigma\xi+\gamma\zeta)/R]}. \tag{317}$$

Neglecting the term $\rho^2/2R$ for small aperture and neglecting the constant phase term $e^{ikR_{20}}$, we can write

$$U(f_\sigma,f_\gamma)\,e^{i2\pi(f_\sigma\xi + f_\gamma\zeta)}. \tag{318}$$

Now the total electric field at the point $(\xi,\zeta)$ in the image plane due to all the points in the aperture plane is

$$u(\xi,\zeta) = \iint_{-\infty}^{\infty} U(f_\sigma,f_\gamma)\,e^{i2\pi(f_\sigma\xi + f_\gamma\zeta)}\,df_\sigma\,df_\gamma. \tag{319}$$

This is the inverse Fourier transform of $U(f_\sigma,f_\gamma)$, i.e.,

$$F\{u(\xi,\zeta)\} = U(f_\sigma,f_\gamma) = W(f_\sigma,f_\gamma)\,T(f_\sigma,f_\gamma). \tag{320}$$

Using the convolution theorem we can write

$$u(\xi,\zeta) = E(\xi,\zeta) * t(\xi,\zeta) \tag{321}$$

where

$$t(\xi,\zeta) = \iint_{-\infty}^{\infty} T(f_\sigma,f_\gamma)\,e^{i2\pi(f_\sigma\xi + f_\gamma\zeta)}\,df_\sigma\,df_\gamma \tag{322}$$

is the impulse response of the system.

In the image plane, intensity is the quantity that is actually observed and not the amplitude. That is, we observe

$$|u(\xi,\zeta)|^2 = v(\xi,\zeta). \tag{323}$$


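Incoherent Light

For incoherent light, intensities rather than amplitudes add. We will now find the expression for the intensity at $(\xi,\zeta)$ due to a point source in the object plane at $(\alpha,\beta)$ and then integrate over the object plane.

In this case we can take advantage of the previous discussion and write for the electric field at point $(\sigma,\gamma)$ just to the right of the aperture the expression

$$U_{\alpha,\beta}(\sigma,\gamma) = T(\sigma,\gamma)\,E(\alpha,\beta)\,e^{ikR_{10}}\,e^{-ik(\sigma\alpha+\gamma\beta)/R} \tag{324}$$

where the subscripts $\alpha$ and $\beta$ indicate that this expression is due to the point source at point $(\alpha,\beta)$. The electric field at $(\xi,\zeta)$ due to the electric field at $(\sigma,\gamma)$ is

$$U_{\alpha,\beta}(\sigma,\gamma)\,e^{ikR_{20}}\,e^{ik(\sigma\xi+\gamma\zeta)/R}. \tag{325}$$

Let $f_\sigma = \sigma/\lambda R$ and $f_\gamma = \gamma/\lambda R$. Then we have

$$U_{\alpha,\beta}(f_\sigma,f_\gamma)\,e^{ikR_{20}}\,e^{i2\pi(f_\sigma\xi+f_\gamma\zeta)} = T(f_\sigma,f_\gamma)\,E(\alpha,\beta)\,e^{ik(R_{10}+R_{20})}\,e^{i2\pi(f_\sigma[\xi-\alpha] + f_\gamma[\zeta-\beta])}. \tag{326}$$

The total electric field at point $(\xi,\zeta)$ due to the point source at point $(\alpha,\beta)$ is

$$u_{\alpha,\beta}(\xi,\zeta) = e^{ik(R_{10}+R_{20})} \iint_{-\infty}^{\infty} T(f_\sigma,f_\gamma)\,E(\alpha,\beta)\,e^{i2\pi(f_\sigma[\xi-\alpha] + f_\gamma[\zeta-\beta])}\,df_\sigma\,df_\gamma. \tag{327}$$

The intensity (the observable) at $(\xi,\zeta)$ due to the point source at point $(\alpha,\beta)$ is

$$|u_{\alpha,\beta}(\xi,\zeta)|^2. \tag{328}$$

To find the total intensity at $(\xi,\zeta)$ due to all the point sources in the object plane we must integrate over the object plane. That is,

$$v(\xi,\zeta) = \iint |u_{\alpha,\beta}(\xi,\zeta)|^2\,d\alpha\,d\beta = \iiiint T(f_\sigma,f_\gamma)\,T^*(f'_\sigma,f'_\gamma)\,W(f_\sigma-f'_\sigma,\,f_\gamma-f'_\gamma)\,e^{i2\pi[\xi(f_\sigma-f'_\sigma) + \zeta(f_\gamma-f'_\gamma)]}\,df_\sigma\,df_\gamma\,df'_\sigma\,df'_\gamma \tag{329}$$

where

$$W(f_\sigma-f'_\sigma,\,f_\gamma-f'_\gamma) = \iint |E(\alpha,\beta)|^2\,e^{-i2\pi\alpha(f_\sigma-f'_\sigma)}\,e^{-i2\pi\beta(f_\gamma-f'_\gamma)}\,d\alpha\,d\beta. \tag{330}$$

Now let $|E(\alpha,\beta)|^2 = w(\alpha,\beta)$ and

$$z = f_\sigma - f'_\sigma, \qquad y = f_\gamma - f'_\gamma. \tag{331}$$

Then

$$v(\xi,\zeta) = \iiiint_{-\infty}^{\infty} T(f_\sigma,f_\gamma)\,T^*(f_\sigma-z,\,f_\gamma-y)\,e^{i2\pi(\xi z+\zeta y)}\,W(z,y)\,df_\sigma\,df_\gamma\,dz\,dy = \iint_{-\infty}^{\infty} W(z,y)\,e^{i2\pi(\xi z+\zeta y)} \left[\iint_{-\infty}^{\infty} T(f_\sigma,f_\gamma)\,T^*(f_\sigma-z,\,f_\gamma-y)\,df_\sigma\,df_\gamma\right] dz\,dy. \tag{332}$$

Let

$$A(z,y) = \iint_{-\infty}^{\infty} T(f_\sigma,f_\gamma)\,T^*(f_\sigma-z,\,f_\gamma-y)\,df_\sigma\,df_\gamma, \tag{333}$$

$$A(z,y) = T(f_\sigma,f_\gamma) \star T^*(f_\sigma,f_\gamma), \tag{334}$$

$$a(\xi,\zeta) = F^{-1}\{A\} = F^{-1}\{T \star T^*\}. \tag{335}$$

Then

$$v(\xi,\zeta) = \iint_{-\infty}^{\infty} A(z,y)\,W(z,y)\,e^{i2\pi(\xi z+\zeta y)}\,dz\,dy \tag{336}$$

and

$$F\{v(\xi,\zeta)\} = A(f_\sigma,f_\gamma)\,W(f_\sigma,f_\gamma). \tag{337}$$

The quantity $A(f_\sigma,f_\gamma)$ is the transfer function for the incoherent light case. Using the convolution theorem we can write

$$v(\xi,\zeta) = a(\xi,\zeta) * w(\xi,\zeta) \tag{338}$$

where $a(\xi,\zeta)$ is the point spread function of the optical system.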
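The difference between the coherent and incoherent cases can be illustrated numerically. The following is a one-dimensional sketch under assumptions that are not part of the derivation: the object is a small set of discrete point sources (so the integral over the object plane becomes a sum), the amplitude impulse response is a sinc function in normalized image coordinates, and all names are hypothetical.

```python
import cmath
import math


def impulse_response(x):
    """Assumed amplitude impulse response: sinc(x) = sin(pi x) / (pi x)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def coherent_intensity(x, sources):
    """Coherent light: add the complex amplitudes, then square the magnitude."""
    field = sum(amp * impulse_response(x - pos) for pos, amp in sources)
    return abs(field) ** 2


def incoherent_intensity(x, sources):
    """Incoherent light: add the intensity contributed by each point source."""
    return sum(abs(amp) ** 2 * impulse_response(x - pos) ** 2 for pos, amp in sources)


# Two unit-amplitude point sources at -0.5 and +0.5, given as (position, amplitude).
in_phase = [(-0.5, 1.0 + 0j), (0.5, 1.0 + 0j)]
out_of_phase = [(-0.5, 1.0 + 0j), (0.5, cmath.exp(1j * math.pi))]
```

At the midpoint between the two sources the coherent intensity depends on their relative phase (it doubles the incoherent value when the sources are in phase and vanishes when they are in antiphase), whereas the incoherent intensity does not.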


Derivation  of  Point  Spread  Functions 


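The Fourier transform pair in two dimensions is

$$T(f_\sigma,f_\gamma) = \iint_{-\infty}^{\infty} t(\xi,\zeta)\,e^{-i(\xi\omega_\sigma + \zeta\omega_\gamma)}\,d\xi\,d\zeta, \tag{339}$$

$$t(\xi,\zeta) = \iint_{-\infty}^{\infty} T(f_\sigma,f_\gamma)\,e^{i(\xi\omega_\sigma + \zeta\omega_\gamma)}\,df_\sigma\,df_\gamma, \tag{340}$$

where $\omega_\sigma = 2\pi f_\sigma$ and $\omega_\gamma = 2\pi f_\gamma$.

Rectangular Aperture

Consider the rectangular aperture with sides $a$ and $b$. For this case the aperture function is

$$T(\sigma,\gamma) = \Pi(\sigma/a)\,\Pi(\gamma/b) \tag{341}$$

where $T(\sigma) = \Pi(\sigma/a)$ is defined by

$$T(\sigma) = \begin{cases} 0, & |\sigma| > a/2 \\ 1/2, & |\sigma| = a/2 \\ 1, & |\sigma| < a/2. \end{cases} \tag{342}$$

This can be rewritten in spatial-frequency coordinates by defining $f_\sigma = \sigma/\lambda R$ and $f_\gamma = \gamma/\lambda R$. Hence,

$$T(f_\sigma,f_\gamma) = \Pi(f_\sigma\lambda R/a)\,\Pi(f_\gamma\lambda R/b). \tag{343}$$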
Slit Aperture

Investigate the point spread function of an infinite slit and an infinite line source. This is the same as considering a one-dimensional case where we have a point source and a one-dimensional aperture. The aperture function for this case is

$$T(\sigma) = \Pi(\sigma/D) \tag{347}$$

where $D$ is the aperture width. Written in spatial-frequency coordinates (347) becomes

$$T(f_\sigma) = \Pi(f_\sigma\lambda R/D). \tag{348}$$

The point spread function is the square of the Fourier transform of $T(f_\sigma)$.

$$t(\xi) = \int_{-\infty}^{\infty} \Pi(f\lambda R/D)\,e^{i2\pi f\xi}\,df = \int_{-D/2\lambda R}^{D/2\lambda R} e^{i2\pi f\xi}\,df = (D/\lambda R)\,\operatorname{sinc}(D\xi/\lambda R). \tag{349}$$

The normalized intensity is then

$$\left[\frac{t(\xi)}{t(0)}\right]^2 = \operatorname{sinc}^2(D\xi/\lambda R). \tag{350}$$
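As a quick numerical illustration of the slit result, the normalized intensity can be evaluated directly. This sketch assumes the convention $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$, which is the one that makes the integral over the slit equal to $(D/\lambda R)\operatorname{sinc}(D\xi/\lambda R)$; the function and parameter names are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def slit_normalized_intensity(xi, width, wavelength, distance):
    """Normalized slit point spread function, sinc^2(D xi / (lambda R))."""
    return sinc(width * xi / (wavelength * distance)) ** 2


# Example values (assumed): 1 mm slit, 500 nm light, 1 m distance.
first_zero = 5e-7 * 1.0 / 1e-3  # lambda R / D
peak = slit_normalized_intensity(0.0, 1e-3, 5e-7, 1.0)
null = slit_normalized_intensity(first_zero, 1e-3, 5e-7, 1.0)
```

The intensity is unity on axis and falls to its first zero at $\xi = \lambda R/D$.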


Circular  Aperture 

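For mathematical convenience rewrite the two-dimensional Fourier transform pair in polar coordinates. Let

$$f_\sigma = f\cos\theta, \quad \xi = r\cos\phi, \quad f_\gamma = f\sin\theta, \quad \text{and} \quad \zeta = r\sin\phi. \tag{351}$$

To make the transformation, use the relationships (Olmsted, 1956)

$$\iint f(\xi,\zeta)\,d\xi\,d\zeta = \iint g(r,\phi)\,|J(r,\phi)|\,dr\,d\phi \tag{352}$$

where $g(r,\phi) = f(\xi(r,\phi),\,\zeta(r,\phi))$ and

$$\iint h(f_\sigma,f_\gamma)\,df_\sigma\,df_\gamma = \iint k(f,\theta)\,|J(f,\theta)|\,df\,d\theta \tag{353}$$

where $k(f,\theta) = h(f_\sigma(f,\theta),\,f_\gamma(f,\theta))$,

$$J(r,\phi) = \frac{\partial(\xi,\zeta)}{\partial(r,\phi)} = \begin{vmatrix} \dfrac{\partial\,r\cos\phi}{\partial r} & \dfrac{\partial\,r\cos\phi}{\partial\phi} \\[2ex] \dfrac{\partial\,r\sin\phi}{\partial r} & \dfrac{\partial\,r\sin\phi}{\partial\phi} \end{vmatrix} = r, \tag{354}$$

$$J(f,\theta) = \frac{\partial(f_\sigma,f_\gamma)}{\partial(f,\theta)} = \begin{vmatrix} \dfrac{\partial\,f\cos\theta}{\partial f} & \dfrac{\partial\,f\cos\theta}{\partial\theta} \\[2ex] \dfrac{\partial\,f\sin\theta}{\partial f} & \dfrac{\partial\,f\sin\theta}{\partial\theta} \end{vmatrix} = f. \tag{355}$$

Making these substitutions yields:

$$t(r,\phi) = \int_0^\infty\!\!\int_0^{2\pi} T(f,\theta)\,e^{i(r\omega\cos\phi\cos\theta + r\omega\sin\phi\sin\theta)}\,f\,df\,d\theta, \tag{356}$$

$$T(f,\theta) = \int_0^\infty\!\!\int_0^{2\pi} t(r,\phi)\,e^{-i(r\omega\cos\phi\cos\theta + r\omega\sin\phi\sin\theta)}\,r\,dr\,d\phi, \tag{357}$$

where $\omega = 2\pi f$.

If $T(f,\theta)$ and $t(r,\phi)$ are symmetrical and have no dependency upon $\theta$ and $\phi$ we can write

$$t(r) = \int_0^\infty T(f) \left[\int_0^{2\pi} e^{i\omega r\cos(\phi-\theta)}\,d\theta\right] f\,df = 2\pi\int_0^\infty T(f)\,J_0(\omega r)\,f\,df, \tag{358}$$

$$T(f) = 2\pi\int_0^\infty t(r)\,J_0(\omega r)\,r\,dr. \tag{359}$$

The point spread function of intensity is the square of the two-dimensional Fourier transform of $T(f)$.

Let $a$ be the radius of the aperture and $\rho$ be the radial coordinate in the aperture plane. The spatial frequency is $f = \rho/\lambda R$.

$$t(r) = 2\pi\int_0^{a/\lambda R} J_0(\omega r)\,f\,df = \frac{1}{2\pi r^2}\int_0^{2\pi ra/\lambda R} J_0(x)\,x\,dx = \frac{\pi a^2}{\lambda^2 R^2}\left[\frac{2J_1(2\pi ra/\lambda R)}{2\pi ra/\lambda R}\right]. \tag{360}$$

The normalized intensity is then

$$\left[\frac{t(r)}{t(0)}\right]^2 = \left[\frac{2J_1(2\pi ra/\lambda R)}{2\pi ra/\lambda R}\right]^2. \tag{361}$$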
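For a circular aperture of radius $a$, the normalized intensity takes the familiar Airy form $[2J_1(x)/x]^2$ with $x = 2\pi ra/\lambda R$. The sketch below evaluates it using only the standard library. It is illustrative rather than definitive: $J_1$ is computed from Bessel's integral representation with a midpoint rule, and the function and parameter names are hypothetical.

```python
import math


def bessel_j1(x, steps=2000):
    """J_1(x) = (1/pi) * integral over [0, pi] of cos(tau - x sin tau), midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def airy_normalized_intensity(r, radius, wavelength, distance):
    """Normalized point spread function of a circular aperture, [2 J_1(x) / x]^2."""
    x = 2.0 * math.pi * r * radius / (wavelength * distance)
    if abs(x) < 1e-12:
        return 1.0  # limiting value on axis
    return (2.0 * bessel_j1(x) / x) ** 2


# The first zero of J_1 is near x = 3.8317, i.e. r is about 0.61 * lambda * R / a.
FIRST_ZERO_OF_J1 = 3.8317059702
```

The first dark ring at $r \approx 0.61\,\lambda R/a$ is the separation used in the Rayleigh criterion.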


Consider the case of white noise where the point sources are separated by the Rayleigh criterion distance. We will assume that the number of measurements, $\ell$, is equal to a multiple of the number of point sources, $km$. Our problem is to find a sampling procedure to minimize

$$\operatorname{tr}\,(A'A)^{-1}. \tag{362}$$

If $k$ measurements are made at the peak of each point spread function then

$$A'A = kI \tag{363}$$

and

$$(A'A)^{-1} = \frac{1}{k}\,I. \tag{364}$$

Therefore, for $m$ point sources

$$\operatorname{tr}\,(A'A)^{-1} = m/k = m^2/\ell. \tag{365}$$

We want to show that these are the conditions necessary to minimize $\operatorname{tr}\,(A'A)^{-1}$ when we have white noise and Rayleigh criterion distances between point sources.

To prove the above we need to consider several lemmas.

Lemma 1

$A'A$ is a symmetric matrix.

Proof:

$$(A'A)' = A'(A')' = A'A. \tag{366}$$

Lemma 2

$A'A$ is a positive definite matrix.

Proof: By definition, $A'A$ is a positive definite matrix if $\langle x, A'Ax\rangle > 0$ (Zadeh and Desoer, 1963) where $\langle\ \rangle$ denotes an inner product. Since $A'A$ is a symmetric matrix with real valued elements, it is a Hermitian matrix. Hence,

$$(A'A)^{+} = [(A'A)^{*}]' = (A'A)' = A'A. \tag{367}$$

Consider Schwarz's inequality

$$\langle x,x\rangle\langle y,y\rangle \ge \langle x,y\rangle\langle y,x\rangle = |\langle x,y\rangle|^2 \tag{368}$$

where equality holds if and only if $y = cx$ where $c$ is a scalar constant. Hence, we have

$$\langle x,x\rangle\langle y,y\rangle \ge \langle x,y\rangle\langle y,x\rangle \ge 0. \tag{369}$$

For any intelligent estimation procedure

$$\langle x,x\rangle > 0 \quad \text{and} \quad |c| > 0. \tag{370}$$

For $y = cx$ we have

$$\langle x,x\rangle\langle y,y\rangle = \langle x,y\rangle\langle y,x\rangle = |c|^2\,\langle x,x\rangle\langle x,x\rangle > 0 \tag{371}$$

which implies that

$$\langle y,y\rangle > 0. \tag{372}$$

Let $y = Ax$, then

$$\langle y,y\rangle = \langle Ax,Ax\rangle = \langle x,A^{+}Ax\rangle = \langle x,A'Ax\rangle > 0. \tag{373}$$

Now consider the case of $y \ne cx$. We now have

$$\langle x,x\rangle\langle y,y\rangle > \langle x,y\rangle\langle y,x\rangle \ge 0; \tag{374}$$

hence,

$$\langle y,y\rangle > 0. \tag{375}$$

Let $y = Ax$, then

$$\langle y,y\rangle = \langle Ax,Ax\rangle = \langle x,A^{+}Ax\rangle = \langle x,A'Ax\rangle > 0. \tag{376}$$

Hence, $A'A$ is a positive definite matrix.

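Lemma 3

Let $A'A = B$.

$$|B| = b_{ii}|B_{ii}| - \sum_{\substack{j,k=1 \\ j\ne i,\ k\ne i}}^{m} b_{ji}\,b_{ik}\,|B_{ii\cdot jk}| \tag{377}$$

where $|B_{ii\cdot jk}|$ is the cofactor of $b_{jk}$ in $B_{ii}$ and $|B_{ii}|$ is the cofactor of $b_{ii}$ in $B$.

Proof:

$$|B| = \sum_{j=1}^{m} b_{ji}|B_{ji}| = b_{ii}|B_{ii}| + \sum_{\substack{j=1 \\ j\ne i}}^{m} b_{ji}|B_{ji}|. \tag{378}$$

Hence, we need to show

$$b_{ji}|B_{ji}| = -b_{ji}\sum_{\substack{k=1 \\ k\ne i}}^{m} b_{ik}\,|B_{ii\cdot jk}|. \tag{379}$$

By definition,

$$|B_{ji}| = (-1)^{i+j} M_{ji} \tag{380}$$

and

$$|B_{ii\cdot jk}| = (-1)^{i+i+j+k} M_{ii\cdot jk} \tag{381}$$

where $M_{ji}$ and $M_{ii\cdot jk}$ are complementary minors (Wylie, 1960). For $j < i$ we can write

$$\begin{aligned} b_{ji}|B_{ji}| &= b_{ji}\big[-b_{i1}(-1)^{i+i+j+1}M_{ii\cdot j1} - b_{i2}(-1)^{i+i+j+2}M_{ii\cdot j2} - \cdots - b_{i,i-1}(-1)^{i+i+j+(i-1)}M_{ii\cdot j,i-1} \\ &\qquad - b_{i,i+1}(-1)^{i+i+j+(i+1)}M_{ii\cdot j,i+1} - \cdots - b_{im}(-1)^{i+i+j+m}M_{ii\cdot jm}\big] \\ &= b_{ji}\big[-b_{i1}|B_{ii\cdot j1}| - b_{i2}|B_{ii\cdot j2}| - \cdots - b_{i,i-1}|B_{ii\cdot j,i-1}| - b_{i,i+1}|B_{ii\cdot j,i+1}| - \cdots - b_{im}|B_{ii\cdot jm}|\big] \\ &= -b_{ji}\sum_{\substack{k=1 \\ k\ne i}}^{m} b_{ik}\,|B_{ii\cdot jk}|. \end{aligned} \tag{382}$$

For $j > i$

$$\begin{aligned} b_{ji}|B_{ji}| &= b_{ji}\big[-b_{i1}(-1)^{i+i+j+1}M_{ii\cdot j1} - b_{i2}(-1)^{i+i+j+2}M_{ii\cdot j2} - \cdots - b_{i,i-1}(-1)^{i+i+j+(i-1)}M_{ii\cdot j,i-1} \\ &\qquad - b_{i,i+1}(-1)^{i+i+j+(i+1)}M_{ii\cdot j,i+1} - \cdots - b_{im}(-1)^{i+i+j+m}M_{ii\cdot jm}\big] \\ &= b_{ji}\big[-b_{i1}|B_{ii\cdot j1}| - b_{i2}|B_{ii\cdot j2}| - \cdots - b_{i,i-1}|B_{ii\cdot j,i-1}| - b_{i,i+1}|B_{ii\cdot j,i+1}| - \cdots - b_{im}|B_{ii\cdot jm}|\big] \\ &= -b_{ji}\sum_{\substack{k=1 \\ k\ne i}}^{m} b_{ik}\,|B_{ii\cdot jk}|. \end{aligned} \tag{383}$$

Hence,

$$b_{ji}|B_{ji}| = -b_{ji}\sum_{\substack{k=1 \\ k\ne i}}^{m} b_{ik}\,|B_{ii\cdot jk}| \tag{384}$$

and therefore

$$|B| = b_{ii}|B_{ii}| - \sum_{\substack{j,k=1 \\ j\ne i,\ k\ne i}}^{m} b_{ji}\,b_{ik}\,|B_{ii\cdot jk}|. \tag{385}$$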
Lemma 4

If $B = A'A$ is a symmetric, real, and positive definite matrix then $B^{-1}$ is positive definite.

Proof: Given $\langle x, Bx\rangle > 0$. Let $x = B^{-1}y$, then

$$\langle x, Bx\rangle = \langle B^{-1}y, BB^{-1}y\rangle = \langle B^{-1}y, y\rangle = \langle y, B^{-1}y\rangle > 0. \tag{386}$$

Hence, $B^{-1}$ is a positive definite matrix.

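Lemma 5

$$\sum_{\substack{j,k=1 \\ j\ne i,\ k\ne i}}^{m} b_{ji}\,b_{ik}\,|B_{ii\cdot jk}| > 0 \tag{387}$$

for $b_{ji}$ and $b_{ik}$ not equal to zero.

Proof: The matrix $B_{ii} = (A_i'A_i)$ is a positive definite matrix since the matrix $B_{ii}$ is a submatrix whose principal diagonal lies along the principal diagonal of $B$. Therefore, $|B_{ii}|$ is positive. This also implies that $B_{ii}^{-1}$ is positive definite. We can write

$$\sum_{\substack{j,k=1 \\ j\ne i,\ k\ne i}}^{m} b_{ji}\,b_{ik}\,|B_{ii\cdot jk}| = |B_{ii}|\,\langle b_i, B_{ii}^{-1}b_i\rangle \tag{388}$$

where $b_i$ denotes the $i$th column of $B$ with the element $b_{ii}$ removed. The quantity $\langle b_i, B_{ii}^{-1}b_i\rangle > 0$ for $b_i \ne 0$ since $B_{ii}^{-1}$ is a positive definite matrix. Hence,

$$\sum_{\substack{j,k=1 \\ j\ne i,\ k\ne i}}^{m} b_{ji}\,b_{ik}\,|B_{ii\cdot jk}| > 0 \tag{389}$$

for $b_{ji}$ and $b_{ik}$ not equal to zero.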

Lemma 6

If $B = (A'A)$ is a symmetric positive definite matrix,

(a) $|B| = b_{ii}|B_{ii}|$ for each $i$ if $B$ is diagonal,

(b) $|B| < b_{ii}|B_{ii}|$ for at least one value of $i$ if $B$ is not diagonal.

Proof:

(a) If $B$ is diagonal, then an expansion by cofactors along the $i$th row will give $b_{ii}|B_{ii}|$. Hence,

$$|B| = b_{ii}|B_{ii}|. \tag{390}$$

(b) Using lemmas 3 and 5 we can write (assuming that $b_{ji}$ and $b_{ik}$ are the off-diagonal elements which are not equal to zero)

$$|B| = b_{ii}|B_{ii}| - \sum_{\substack{j,k=1 \\ j\ne i,\ k\ne i}}^{m} b_{ji}\,b_{ik}\,|B_{ii\cdot jk}| < b_{ii}|B_{ii}|. \tag{391}$$

Hence,

$$|B| < b_{ii}|B_{ii}|. \tag{392}$$

Note: By hypothesis $B$ is not a diagonal matrix; hence, there are at least two non-zero symmetrically located off-diagonal elements (since $B$ is symmetrical). Assume that these two non-zero elements are $b_{ji}$ and $b_{ij}$.

Lemma 7

$B$ is a symmetric, positive definite matrix with diagonal elements $k_i$ and with arbitrary off-diagonal elements such that the matrix is positive definite. The elements along the principal diagonal of $B^{-1}$ will be a minimum of $1/k_i$ if and only if $B$ is a diagonal matrix.

Proof: If $B$ is a diagonal matrix then

$$B^{-1} = \begin{bmatrix} 1/k_1 & & 0 \\ & \ddots & \\ 0 & & 1/k_m \end{bmatrix}. \tag{393}$$

From lemma 6

$$0 < |B| \le b_{ii}|B_{ii}| \tag{394}$$

since $|B_{ii}| > 0$ and $|B| > 0$, which implies

$$\frac{1}{b_{ii}} \le \frac{|B_{ii}|}{|B|}.$$

Since the $i$th diagonal element of $B^{-1}$ is

$$b_{ii}^{-1} = \frac{|B_{ii}|}{|B|} \tag{398}$$

then

$$b_{ii}^{-1} \ge \frac{1}{b_{ii}}. \tag{399}$$

If $B$ is not a diagonal matrix then

$$|B| < b_{ii}|B_{ii}|$$

for some $i$, which implies that

$$b_{ii}^{-1} > \frac{1}{b_{ii}}. \tag{402}$$
Lemma 8
$\sum_{i=1}^{m} 1/a_i$ is a minimum when the $a$'s are equal (i.e., $a_1 = a_2 = \cdots = a_m = \ell/m$) given that $\sum_{i=1}^{m} a_i = \ell$.

Proof: We will use dynamic programming to show this proof. Given the function

$$f(a_1, a_2, \ldots, a_m) = \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_m} \tag{403}$$

we want to minimize $f(a_1, \ldots, a_m)$ for $a_1 \ge 0$, $a_2 \ge 0$, ..., and $a_m \ge 0$ and subject to the constraint $a_1 + a_2 + \cdots + a_m = \ell$. Hence we can write

$$\min_{0 \le a_m \le \ell}\ \min_{\substack{a_1 \ge 0 \\ \vdots \\ a_{m-1} \ge 0}} \left[\frac{1}{a_1} + \cdots + \frac{1}{a_{m-1}} + \frac{1}{a_m}\right] = \min_{0 \le a_m \le \ell} \left[\frac{1}{a_m} + \min_{\substack{a_1 \ge 0 \\ \vdots \\ a_{m-1} \ge 0}} \left(\frac{1}{a_1} + \cdots + \frac{1}{a_{m-1}}\right)\right] \tag{404}$$

subject to $a_1 + a_2 + \cdots + a_{m-1} = \ell - a_m = \ell_{m-1}$. If we define

$$f_m(\ell_m) = \min_{0 \le a_m \le \ell_m} \left[\frac{1}{a_m} + f_{m-1}(\ell_m - a_m)\right] \tag{405}$$

we can write

$$f_1(\ell_1) = \min_{0 \le a_1 \le \ell_1} \frac{1}{a_1} \tag{406}$$

which implies that $a_1 = \ell_1$ and $\ell_1 = \ell_2 - a_2$. Also,

$$f_2(\ell_2) = \min_{0 \le a_2 \le \ell_2} \left[\frac{1}{a_2} + f_1(\ell_2 - a_2)\right] = \min_{0 \le a_2 \le \ell_2} \left[\frac{1}{a_2} + \frac{1}{\ell_2 - a_2}\right]. \tag{407}$$

To minimize (407) we require that

$$0 = \frac{-\ell_2(\ell_2 - 2a_2)}{a_2^2(\ell_2 - a_2)^2} \tag{408}$$

and

$$2\ell_2 > 0 \tag{409}$$

(sufficiency condition for minimum) which implies that $a_2 = \ell_2/2$. Hence, $f_2(\ell_2) = 4/\ell_2$ and $\ell_1 = \ell_2 - a_2 = \ell_2 - \ell_2/2 = \ell_2/2$, which implies that $a_1 = \ell_2/2 = a_2$. We will finish the proof by using mathematical induction. Assume lemma 8 is true for $m-1$ (i.e., $a_1 = a_2 = \cdots = a_{m-1}$) and prove true for $m$ (i.e., $a_1 = a_2 = \cdots = a_{m-1} = a_m$). Assuming lemma 8 is true for $m-1$ implies that

$$a_{m-1} = \frac{\ell_{m-1}}{m-1} = \frac{\ell_m - a_m}{m-1} \tag{410}$$

since $a_1 = a_2 = \cdots = a_{m-1}$ and $a_1 + a_2 + \cdots + a_{m-1} = \ell_{m-1} = \ell_m - a_m$. We can write

$$f_m(\ell_m) = \min_{0 \le a_m \le \ell_m} \left[\frac{1}{a_m} + f_{m-1}(\ell_m - a_m)\right] = \min_{0 \le a_m \le \ell_m} \left[\frac{\ell_m + m^2 a_m - 2m a_m}{a_m(\ell_m - a_m)}\right]. \tag{411}$$

For minimization we require that

$$\frac{\partial f}{\partial a_m} = 0 = \frac{-\ell_m^2 + 2a_m\ell_m + (m^2 - 2m)\,a_m^2}{a_m^2(\ell_m - a_m)^2} \tag{412}$$

and

$$2\ell_m + 2(m^2 - 2m)\,a_m > 0 \quad \text{for } m \ge 2 \tag{413}$$

(sufficiency condition for minimum). Hence,

$$a_m = \frac{\ell_m\,[1 \pm (m-1)]}{-m(m-2)}. \tag{414}$$

For the plus sign we have $a_m = -\ell_m/(m-2) < 0$, which does not meet the requirement of $a_m \ge 0$. For the minus sign we have $a_m = \ell_m/m > 0$. We can then write

$$a_1 + a_2 + \cdots + a_{m-1} = (m-1)\,a_{m-1} = \ell_m - a_m = m a_m - a_m = (m-1)\,a_m \tag{415}$$

which implies that $a_{m-1} = a_m$. Hence,

$$a_1 = a_2 = \cdots = a_{m-1} = a_m. \tag{416}$$
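As a cross-check on Lemma 8, the same conclusion follows in one line from the Cauchy-Schwarz inequality. This is an alternative argument added for illustration, not the dynamic-programming proof used above; it assumes all $a_i > 0$.

```latex
m^2 = \left(\sum_{i=1}^{m} \sqrt{a_i}\,\frac{1}{\sqrt{a_i}}\right)^{2}
\le \left(\sum_{i=1}^{m} a_i\right)\left(\sum_{i=1}^{m} \frac{1}{a_i}\right)
= \ell \sum_{i=1}^{m} \frac{1}{a_i}
\quad\Longrightarrow\quad
\sum_{i=1}^{m} \frac{1}{a_i} \ge \frac{m^2}{\ell},
```

with equality if and only if all the $a_i$ are equal, i.e. $a_i = \ell/m$. The lower bound $m^2/\ell$ is the value of the trace obtained when the same number of measurements is made at the peak of each point spread function.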

From the above lemmas we can see that in order to minimize the
trace of $B^{-1} = (A'A)^{-1}$ we must make the off-diagonal elements equal
to zero. This implies that we make all our measurements at peaks of
point spread functions. In doing this for $A'A$ the diagonal elements
become larger as the off-diagonal elements go to zero; hence, the trace
is further reduced. From the last lemma, we see that each of the diagonal
elements of $A'A$ must be equal, which implies that the same number of
measurements be made at the peak of each point spread function.
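The conclusion can be checked numerically on a small example. The sketch below is an illustration under assumptions not taken from the text: two point sources, a one-dimensional $\operatorname{sinc}^2$ point spread function in normalized coordinates (so that Rayleigh spacing corresponds to unit separation), four measurements, and hypothetical function names.

```python
import math


def psf(x):
    """Assumed one-dimensional point spread function, sinc^2, first zero at x = 1."""
    if x == 0.0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2


def gram_matrix(measure_positions, source_positions):
    """B = A'A, where A[i][j] is the response at measurement i to source j."""
    a = [[psf(x - s) for s in source_positions] for x in measure_positions]
    m = len(source_positions)
    return [[sum(row[p] * row[q] for row in a) for q in range(m)] for p in range(m)]


def trace_of_inverse_2x2(b):
    """Trace of the inverse of a 2 x 2 matrix: (b11 + b22) / det(b)."""
    det = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    return (b[0][0] + b[1][1]) / det


sources = [0.0, 1.0]  # two sources at Rayleigh spacing for this psf

at_peaks_equal = trace_of_inverse_2x2(gram_matrix([0.0, 0.0, 1.0, 1.0], sources))
at_peaks_unequal = trace_of_inverse_2x2(gram_matrix([0.0, 0.0, 0.0, 1.0], sources))
off_peak = trace_of_inverse_2x2(gram_matrix([0.0, 0.5, 0.5, 1.0], sources))
```

With $m = 2$ and $\ell = 4$, sampling twice at each peak gives the minimum value $m^2/\ell = 1$; splitting the measurements unequally, or moving some of them off the peaks, gives a larger trace.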